I'd prefer to avoid having the APIs reflect the scope of the implementation
beans under the hood. On the other hand, I don't like forcing methods at
the bottom of the call stack requiring parameters because they're necessary
later on. I liked passing CallContext around because it didn't really
matter if the bean was ApplicationScoped or RequestScoped - CallContext was
just there. Passing the CallContext to the PolarisMetaStoreManager meant
that the underlying implementation could use an ApplicationScoped bean,
like the PolarisConfigurationStore and the current realm could be
determined from the method parameter. It's true, we could make everything
RequestScoped and then nothing needs to be passed in, but that feels
inefficient - e.g., should the PolarisConfigurationStore re-parse the
configuration map for every request?

A downside is that CallContext becomes the "everything bean" where you just
add properties because it's the easiest way to get a new parameter passed
to some bean 3 layers down, but I honestly think that's less of a concern
when the CDI context already allows for an easier way of injecting
dependencies.

Mike

On Mon, Jun 9, 2025 at 3:54 PM Dmitri Bourlatchkov <di...@apache.org> wrote:

> Thanks for stating your concerns, Yun!
>
> Re: MetastoreManager, I believe the current state of that code is actually
> the source of confusion and I'm trying to clarify that.
>
> For example, TransactionalMetaStoreManagerImpl is currently created per
> realm. However, this class does not carry any state. It is an empty shell
> that works off of the data passed as PolarisCallContext parameters. The
> same holds for AtomicOperationMetaStoreManager.
>
> As such, PolarisMetaStoreManager can be request-scoped beans. I do not see
> any reason to keep those objects after the request is done.
>
> WDYT?
>
> Thanks,
> Dmitri.
>
> On Mon, Jun 9, 2025 at 5:57 PM yun zou <yunzou.colost...@gmail.com> wrote:
>
> > Hi Dmitri,
> >
> > Thanks for bringing that up!
> >
> > I am not an expert with CDI, based my recent experience with CDI, while
> CDI
> > does help making development
> > simpler under many cases, it does also make things more complicated under
> > some scenarios. Especially when
> > injecting a request scoped object into application scoped class, it does
> > make things harder to reason about when
> > debugging, and it also seems we currently lack efficient ways of testing
> to
> > catch potential regressions. For those
> > cases, i think explicitly passing the objects as parameters is actually
> > more clear and robust.
> >
> > Furthermore, I think this could make things easier for new developers for
> > OSS polaris.
> >
> > Also, regarding the metastore object, i don't think i fully understand
> the
> > proposal, are we proposing to make the
> > MetastoreManager now request scoped? Currently, the metastore manager is
> > created per realm, not per
> > RealmContext, which means two requests with the same realmId can use the
> > same metastore manager. If we are turning
> > this class into a request scope, I am not sure if that would be a good
> > idea, especially when we are dealing with Entity Caching
> > optimizations, where caching per realm instead of per request probably
> > makes more sense.
> >
> > Best Regards,
> > Yun
> >
> >
> >
> > On Fri, Jun 6, 2025 at 2:47 PM Yufei Gu <flyrain...@gmail.com> wrote:
> >
> > > Agreed with Alex. I think passing a realm context is a cleaner solution
> > > than injecting a request scope realm context. CDI provides a lot of
> > > benefits, but it doesn't mean that we should alway use it.
> > >
> > > In general, injecting a request-scoped bean into an application-scoped
> > bean
> > > is perfectly valid in CDI, but it does come with a few gotchas:
> > >
> > >    1. Lifecycle surprises. An @ApplicationScoped method can run in
> > contexts
> > >    where no request scope is active (e.g., async tasks, scheduled
> jobs).
> > In
> > >    those cases the injected proxy can’t resolve a real instance,
> leading
> > to
> > >    ContextNotActiveException or, worse, stale data.
> > >    2. Debugging pain, When failures depend on the presence or absence
> of
> > a
> > >    request context, stack traces alone rarely tell the full story.
> > Tracking
> > >    down intermittent bugs becomes tricky.
> > >    3. Unnecessary complexity, which is a contradiction of using
> > injection.
> > >    Injection is supposed to make a simpler and cleaner code base. If
> you
> > >    really only need a bit of request data, passing it as a method
> > parameter
> > >    keeps the code simpler, avoids proxy lookups, and makes unit tests
> > > cleaner.
> > >
> > > As a rule of thumb, we may inject a narrower scope only when we must.
> > >
> > > Yufei
> > >
> > >
> > > On Fri, Jun 6, 2025 at 11:15 AM Dmitri Bourlatchkov <di...@apache.org>
> > > wrote:
> > >
> > > > Thanks, Alex! Very good points!
> > > >
> > > > Re: metrics, I did a quick POC by adding `@Inject RealmContext rc`
> > > > to QuarkusValueExpressionResolver and it appears to work. That
> > > > is, QuarkusValueExpressionResolver can get the realm ID from the
> > injected
> > > > RealmContext as opposed to the API parameter.
> > > >
> > > > With that, we could probably change the interpretation of the
> > > > "realmIdentifier" expression to be the injected RealmContext, remove
> > > > explicit realm parameters from the generated API classes and move the
> > > > `@MeterTag(key="realm_id",expression="realmIdentifier")` annotation
> to
> > > the
> > > > method level.
> > > >
> > > > WDYT?
> > > >
> > > > Thanks,
> > > > Dmitri.
> > > >
> > > > On Fri, Jun 6, 2025 at 12:41 PM Alex Dutra
> > <alex.du...@dremio.com.invalid
> > > >
> > > > wrote:
> > > >
> > > > > Hi Dmitri,
> > > > >
> > > > > Thanks for bringing up this important topic.
> > > > >
> > > > > I generally agree with your refactoring proposal, but with 2
> caveats:
> > > > >
> > > > > 1) For an application-scoped bean exposing methods to retrieve
> > > > realm-scoped
> > > > > components (e.g., the "factory" beans that expose methods like
> > > > > "getOrCreateXYZ") I think it's actually OK to receive the
> > RealmContext
> > > > as a
> > > > > method parameter, as opposed to having the RealmContext injected.
> > This
> > > is
> > > > > because the injection, while possible, would only work if the
> request
> > > > > context is active at the moment where a method on that bean is
> > invoked.
> > > > > This is hard to enforce at compile time and also hard to reason
> about
> > > > imho.
> > > > > IOW I prefer to see this:
> > > > >
> > > > > @ApplicationScoped public class XYZFactory {
> > > > >     public XYZ getOrCreateXYZ(RealmContext ctx) { return
> > > > > doSomeStuff(ctx.getRealmIdentifier()); }
> > > > > }
> > > > >
> > > > > Rather than this:
> > > > >
> > > > > @ApplicationScoped public class XYZFactory {
> > > > >     @Inject RealmContext ctx;
> > > > >     public XYZ getOrCreateXYZ() { return
> > > > > doSomeStuff(ctx.getRealmIdentifier()); }
> > > > > }
> > > > >
> > > > > 2) We do need a RealmContext parameter in the API classes generated
> > > from
> > > > > Mustache templates [1], because the RealmContext parameter is
> > annotated
> > > > as
> > > > > follows:
> > > > >
> > > > > @Context @MeterTag(key="realm_id",expression="realmIdentifier")
> > > > > RealmContext realmContext
> > > > >
> > > > > This is so that we can produce "realm_id" tags for API metrics.
> Thus
> > we
> > > > > cannot remove it (unless there is another way to produce such tags,
> > but
> > > > I'm
> > > > > not aware of any).
> > > > >
> > > > > But we could remove it from the Mustache templates for *service*
> > > classes
> > > > > [2]. This would effectively remove RealmContext as a method
> parameter
> > > > from
> > > > > virtually all components down the call stack, thus forcing them to
> > > inject
> > > > > RealmContext wherever necessary.
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Alex
> > > > >
> > > > > [1]:
> > > > >
> > > > >
> > > >
> > >
> >
> https://github.com/apache/polaris/blob/1baef0edec7dfd5d6440dc1d177a713c3b29055c/server-templates/api.mustache#L96
> > > > > [2]:
> > > > >
> > > > >
> > > >
> > >
> >
> https://github.com/apache/polaris/blob/1baef0edec7dfd5d6440dc1d177a713c3b29055c/server-templates/apiService.mustache#L39
> > > > >
> > > > > On Fri, Jun 6, 2025 at 3:49 AM Dmitri Bourlatchkov <
> di...@apache.org
> > >
> > > > > wrote:
> > > > >
> > > > > > An example "skew" in current code: IcebergCatalogAdapter gets
> > > > > RealmContext
> > > > > > injected as a field, but also as a method parameters in
> > > > > createNamespace().
> > > > > >
> > > > > > On Thu, Jun 5, 2025 at 8:35 PM Dmitri Bourlatchkov <
> > di...@apache.org
> > > >
> > > > > > wrote:
> > > > > >
> > > > > > > Hi All,
> > > > > > >
> > > > > > > Recently a few runtime issues related to the resolution of
> > > > RealmContex
> > > > > > got
> > > > > > > fixed (mostly by Yun's PRs).
> > > > > > >
> > > > > > > These fixes center around the idea of passing RealmContext as a
> > > > > parameter
> > > > > > > across various method calls.
> > > > > > >
> > > > > > > That approach is effective, however, I tend to think it goes
> > > against
> > > > > the
> > > > > > > grain of the dependency injection paradigm. The main point is
> > that
> > > > > realm
> > > > > > is
> > > > > > > a contextual concept in Polaris REST APIs (it is not a
> parameter
> > in
> > > > any
> > > > > > API
> > > > > > > operations). Subsequently, all the service and core code logic
> is
> > > > > > normally
> > > > > > > realm-neutral. That is to say that services operate without
> > having
> > > to
> > > > > > know
> > > > > > > what realm they are servicing.
> > > > > > >
> > > > > > > Realm IDs become prominent only at the persistence level (to
> > ensure
> > > > > > > isolation of realm data).
> > > > > > >
> > > > > > > I'd like to propose refactoring Polaris code to remove
> > RealmContext
> > > > > from
> > > > > > > method parameters and keep it in instance fields only where it
> is
> > > > used.
> > > > > > >
> > > > > > > * MetaStore objects for REST API endpoints will be produced
> for a
> > > > > > > particular real with the request scope lifetime
> > > > > > > * Tasks will operate in a similar way, where PR [1817]
> provides a
> > > POC
> > > > > > > * The Quarkus runtime will use CDI for injection
> > > > > > > * The polaris-core module will remain free from CDI
> annotations,
> > > but
> > > > > some
> > > > > > > changes will probably be required to allow injection via
> > > > constructors.
> > > > > > >
> > > > > > > WDYT?
> > > > > > >
> > > > > > > If there are no strong objections up front, I can work on a
> > bigger
> > > > POC
> > > > > PR
> > > > > > > to allow reviewing / exploring this idea in more depth.
> > > > > > >
> > > > > > > [1817] https://github.com/apache/polaris/pull/1817
> > > > > > >
> > > > > > > Cheers,
> > > > > > > Dmitri.
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>

Reply via email to