Hi Dmitri, hi all,

> move the `@MeterTag(key="realm_id",expression="realmIdentifier")` annotation 
> to the method level.

I tried that as well, and even if it's possible to place the
annotation at method level, it seems that doing so is not supported:
the expression resolver doesn't get invoked at all. You can try the
RealmIdTagEnabledMetricsTest and you will notice that the tests will
fail.

Thanks,

Alex


On Tue, Jun 10, 2025 at 5:59 AM yun zou <yunzou.colost...@gmail.com> wrote:
>
> Hi Dmitri,
>
> Thanks a lot for the clarification! I didn't realize
> PolarisMetaStoreManager is a request-scoped bean.
>
> I have to say that the whole implementation is kind of confusing. Although
> I do agree that those
> *MetastoreManagerImpl is not needed after the request ends, i am afraid
> things may not be that simple, I see
> the EntityManager is also created in a similar way, which is created per
> realm, and cached in a map.
> Code reference:
> https://github.com/apache/polaris/blob/main/service/common/src/main/java/org/apache/polaris/service/config/RealmEntityManagerFactory.java#L56
>
> The EntityManager has a pointer to MetastoreManager, entityCache and
> credentialCache, so you might also need to turn
> EntityManager into a pure request scope also, and I am not sure if there
> are other places that also have similar situations.
>
> It seems that if we want to turn the Metastore Manager into a request
> scope, things could turn out to be more complicated than
> what we thought, and we probably need to be careful about this.
>
> Best Regards,
> Yun
>
> On Mon, Jun 9, 2025 at 3:32 PM Dmitri Bourlatchkov <di...@apache.org> wrote:
>
> > Thanks for stating your concerns, Yun!
> >
> > Re: MetastoreManager, I believe the current state of that code is actually
> > the source of confusion and I'm trying to clarify that.
> >
> > For example, TransactionalMetaStoreManagerImpl is currently created per
> > realm. However, this class does not carry any state. It is an empty shell
> > that works off of the data passed as PolarisCallContext parameters. The
> > same holds for AtomicOperationMetaStoreManager.
> >
> > As such, PolarisMetaStoreManager can be request-scoped beans. I do not see
> > any reason to keep those objects after the request is done.
> >
> > WDYT?
> >
> > Thanks,
> > Dmitri.
> >
> > On Mon, Jun 9, 2025 at 5:57 PM yun zou <yunzou.colost...@gmail.com> wrote:
> >
> > > Hi Dmitri,
> > >
> > > Thanks for bringing that up!
> > >
> > > I am not an expert with CDI, based my recent experience with CDI, while
> > CDI
> > > does help making development
> > > simpler under many cases, it does also make things more complicated under
> > > some scenarios. Especially when
> > > injecting a request scoped object into application scoped class, it does
> > > make things harder to reason about when
> > > debugging, and it also seems we currently lack efficient ways of testing
> > to
> > > catch potential regressions. For those
> > > cases, i think explicitly passing the objects as parameters is actually
> > > more clear and robust.
> > >
> > > Furthermore, I think this could make things easier for new developers for
> > > OSS polaris.
> > >
> > > Also, regarding the metastore object, i don't think i fully understand
> > the
> > > proposal, are we proposing to make the
> > > MetastoreManager now request scoped? Currently, the metastore manager is
> > > created per realm, not per
> > > RealmContext, which means two requests with the same realmId can use the
> > > same metastore manager. If we are turning
> > > this class into a request scope, I am not sure if that would be a good
> > > idea, especially when we are dealing with Entity Caching
> > > optimizations, where caching per realm instead of per request probably
> > > makes more sense.
> > >
> > > Best Regards,
> > > Yun
> > >
> > >
> > >
> > > On Fri, Jun 6, 2025 at 2:47 PM Yufei Gu <flyrain...@gmail.com> wrote:
> > >
> > > > Agreed with Alex. I think passing a realm context is a cleaner solution
> > > > than injecting a request scope realm context. CDI provides a lot of
> > > > benefits, but it doesn't mean that we should alway use it.
> > > >
> > > > In general, injecting a request-scoped bean into an application-scoped
> > > bean
> > > > is perfectly valid in CDI, but it does come with a few gotchas:
> > > >
> > > >    1. Lifecycle surprises. An @ApplicationScoped method can run in
> > > contexts
> > > >    where no request scope is active (e.g., async tasks, scheduled
> > jobs).
> > > In
> > > >    those cases the injected proxy can’t resolve a real instance,
> > leading
> > > to
> > > >    ContextNotActiveException or, worse, stale data.
> > > >    2. Debugging pain, When failures depend on the presence or absence
> > of
> > > a
> > > >    request context, stack traces alone rarely tell the full story.
> > > Tracking
> > > >    down intermittent bugs becomes tricky.
> > > >    3. Unnecessary complexity, which is a contradiction of using
> > > injection.
> > > >    Injection is supposed to make a simpler and cleaner code base. If
> > you
> > > >    really only need a bit of request data, passing it as a method
> > > parameter
> > > >    keeps the code simpler, avoids proxy lookups, and makes unit tests
> > > > cleaner.
> > > >
> > > > As a rule of thumb, we may inject a narrower scope only when we must.
> > > >
> > > > Yufei
> > > >
> > > >
> > > > On Fri, Jun 6, 2025 at 11:15 AM Dmitri Bourlatchkov <di...@apache.org>
> > > > wrote:
> > > >
> > > > > Thanks, Alex! Very good points!
> > > > >
> > > > > Re: metrics, I did a quick POC by adding `@Inject RealmContext rc`
> > > > > to QuarkusValueExpressionResolver and it appears to work. That
> > > > > is, QuarkusValueExpressionResolver can get the realm ID from the
> > > injected
> > > > > RealmContext as opposed to the API parameter.
> > > > >
> > > > > With that, we could probably change the interpretation of the
> > > > > "realmIdentifier" expression to be the injected RealmContext, remove
> > > > > explicit realm parameters from the generated API classes and move the
> > > > > `@MeterTag(key="realm_id",expression="realmIdentifier")` annotation
> > to
> > > > the
> > > > > method level.
> > > > >
> > > > > WDYT?
> > > > >
> > > > > Thanks,
> > > > > Dmitri.
> > > > >
> > > > > On Fri, Jun 6, 2025 at 12:41 PM Alex Dutra
> > > <alex.du...@dremio.com.invalid
> > > > >
> > > > > wrote:
> > > > >
> > > > > > Hi Dmitri,
> > > > > >
> > > > > > Thanks for bringing up this important topic.
> > > > > >
> > > > > > I generally agree with your refactoring proposal, but with 2
> > caveats:
> > > > > >
> > > > > > 1) For an application-scoped bean exposing methods to retrieve
> > > > > realm-scoped
> > > > > > components (e.g., the "factory" beans that expose methods like
> > > > > > "getOrCreateXYZ") I think it's actually OK to receive the
> > > RealmContext
> > > > > as a
> > > > > > method parameter, as opposed to having the RealmContext injected.
> > > This
> > > > is
> > > > > > because the injection, while possible, would only work if the
> > request
> > > > > > context is active at the moment where a method on that bean is
> > > invoked.
> > > > > > This is hard to enforce at compile time and also hard to reason
> > about
> > > > > imho.
> > > > > > IOW I prefer to see this:
> > > > > >
> > > > > > @ApplicationScoped public class XYZFactory {
> > > > > >     public XYZ getOrCreateXYZ(RealmContext ctx) { return
> > > > > > doSomeStuff(ctx.getRealmIdentifier()); }
> > > > > > }
> > > > > >
> > > > > > Rather than this:
> > > > > >
> > > > > > @ApplicationScoped public class XYZFactory {
> > > > > >     @Inject RealmContext ctx;
> > > > > >     public XYZ getOrCreateXYZ() { return
> > > > > > doSomeStuff(ctx.getRealmIdentifier()); }
> > > > > > }
> > > > > >
> > > > > > 2) We do need a RealmContext parameter in the API classes generated
> > > > from
> > > > > > Mustache templates [1], because the RealmContext parameter is
> > > annotated
> > > > > as
> > > > > > follows:
> > > > > >
> > > > > > @Context @MeterTag(key="realm_id",expression="realmIdentifier")
> > > > > > RealmContext realmContext
> > > > > >
> > > > > > This is so that we can produce "realm_id" tags for API metrics.
> > Thus
> > > we
> > > > > > cannot remove it (unless there is another way to produce such tags,
> > > but
> > > > > I'm
> > > > > > not aware of any).
> > > > > >
> > > > > > But we could remove it from the Mustache templates for *service*
> > > > classes
> > > > > > [2]. This would effectively remove RealmContext as a method
> > parameter
> > > > > from
> > > > > > virtually all components down the call stack, thus forcing them to
> > > > inject
> > > > > > RealmContext wherever necessary.
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > Alex
> > > > > >
> > > > > > [1]:
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> > https://github.com/apache/polaris/blob/1baef0edec7dfd5d6440dc1d177a713c3b29055c/server-templates/api.mustache#L96
> > > > > > [2]:
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> > https://github.com/apache/polaris/blob/1baef0edec7dfd5d6440dc1d177a713c3b29055c/server-templates/apiService.mustache#L39
> > > > > >
> > > > > > On Fri, Jun 6, 2025 at 3:49 AM Dmitri Bourlatchkov <
> > di...@apache.org
> > > >
> > > > > > wrote:
> > > > > >
> > > > > > > An example "skew" in current code: IcebergCatalogAdapter gets
> > > > > > RealmContext
> > > > > > > injected as a field, but also as a method parameters in
> > > > > > createNamespace().
> > > > > > >
> > > > > > > On Thu, Jun 5, 2025 at 8:35 PM Dmitri Bourlatchkov <
> > > di...@apache.org
> > > > >
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Hi All,
> > > > > > > >
> > > > > > > > Recently a few runtime issues related to the resolution of
> > > > > RealmContex
> > > > > > > got
> > > > > > > > fixed (mostly by Yun's PRs).
> > > > > > > >
> > > > > > > > These fixes center around the idea of passing RealmContext as a
> > > > > > parameter
> > > > > > > > across various method calls.
> > > > > > > >
> > > > > > > > That approach is effective, however, I tend to think it goes
> > > > against
> > > > > > the
> > > > > > > > grain of the dependency injection paradigm. The main point is
> > > that
> > > > > > realm
> > > > > > > is
> > > > > > > > a contextual concept in Polaris REST APIs (it is not a
> > parameter
> > > in
> > > > > any
> > > > > > > API
> > > > > > > > operations). Subsequently, all the service and core code logic
> > is
> > > > > > > normally
> > > > > > > > realm-neutral. That is to say that services operate without
> > > having
> > > > to
> > > > > > > know
> > > > > > > > what realm they are servicing.
> > > > > > > >
> > > > > > > > Realm IDs become prominent only at the persistence level (to
> > > ensure
> > > > > > > > isolation of realm data).
> > > > > > > >
> > > > > > > > I'd like to propose refactoring Polaris code to remove
> > > RealmContext
> > > > > > from
> > > > > > > > method parameters and keep it in instance fields only where it
> > is
> > > > > used.
> > > > > > > >
> > > > > > > > * MetaStore objects for REST API endpoints will be produced
> > for a
> > > > > > > > particular real with the request scope lifetime
> > > > > > > > * Tasks will operate in a similar way, where PR [1817]
> > provides a
> > > > POC
> > > > > > > > * The Quarkus runtime will use CDI for injection
> > > > > > > > * The polaris-core module will remain free from CDI
> > annotations,
> > > > but
> > > > > > some
> > > > > > > > changes will probably be required to allow injection via
> > > > > constructors.
> > > > > > > >
> > > > > > > > WDYT?
> > > > > > > >
> > > > > > > > If there are no strong objections up front, I can work on a
> > > bigger
> > > > > POC
> > > > > > PR
> > > > > > > > to allow reviewing / exploring this idea in more depth.
> > > > > > > >
> > > > > > > > [1817] https://github.com/apache/polaris/pull/1817
> > > > > > > >
> > > > > > > > Cheers,
> > > > > > > > Dmitri.
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >

Reply via email to