Thanks for stating your concerns, Yun! Re: MetastoreManager, I believe the current state of that code is actually the source of confusion and I'm trying to clarify that.
For example, TransactionalMetaStoreManagerImpl is currently created per realm. However, this class does not carry any state. It is an empty shell that works off of the data passed as PolarisCallContext parameters. The same holds for AtomicOperationMetaStoreManager. As such, PolarisMetaStoreManager can be request-scoped beans. I do not see any reason to keep those objects after the request is done. WDYT? Thanks, Dmitri. On Mon, Jun 9, 2025 at 5:57 PM yun zou <yunzou.colost...@gmail.com> wrote: > Hi Dmitri, > > Thanks for bringing that up! > > I am not an expert with CDI, based my recent experience with CDI, while CDI > does help making development > simpler under many cases, it does also make things more complicated under > some scenarios. Especially when > injecting a request scoped object into application scoped class, it does > make things harder to reason about when > debugging, and it also seems we currently lack efficient ways of testing to > catch potential regressions. For those > cases, i think explicitly passing the objects as parameters is actually > more clear and robust. > > Furthermore, I think this could make things easier for new developers for > OSS polaris. > > Also, regarding the metastore object, i don't think i fully understand the > proposal, are we proposing to make the > MetastoreManager now request scoped? Currently, the metastore manager is > created per realm, not per > RealmContext, which means two requests with the same realmId can use the > same metastore manager. If we are turning > this class into a request scope, I am not sure if that would be a good > idea, especially when we are dealing with Entity Caching > optimizations, where caching per realm instead of per request probably > makes more sense. > > Best Regards, > Yun > > > > On Fri, Jun 6, 2025 at 2:47 PM Yufei Gu <flyrain...@gmail.com> wrote: > > > Agreed with Alex. I think passing a realm context is a cleaner solution > > than injecting a request scope realm context. CDI provides a lot of > > benefits, but it doesn't mean that we should alway use it. > > > > In general, injecting a request-scoped bean into an application-scoped > bean > > is perfectly valid in CDI, but it does come with a few gotchas: > > > > 1. Lifecycle surprises. An @ApplicationScoped method can run in > contexts > > where no request scope is active (e.g., async tasks, scheduled jobs). > In > > those cases the injected proxy can’t resolve a real instance, leading > to > > ContextNotActiveException or, worse, stale data. > > 2. Debugging pain, When failures depend on the presence or absence of > a > > request context, stack traces alone rarely tell the full story. > Tracking > > down intermittent bugs becomes tricky. > > 3. Unnecessary complexity, which is a contradiction of using > injection. > > Injection is supposed to make a simpler and cleaner code base. If you > > really only need a bit of request data, passing it as a method > parameter > > keeps the code simpler, avoids proxy lookups, and makes unit tests > > cleaner. > > > > As a rule of thumb, we may inject a narrower scope only when we must. > > > > Yufei > > > > > > On Fri, Jun 6, 2025 at 11:15 AM Dmitri Bourlatchkov <di...@apache.org> > > wrote: > > > > > Thanks, Alex! Very good points! > > > > > > Re: metrics, I did a quick POC by adding `@Inject RealmContext rc` > > > to QuarkusValueExpressionResolver and it appears to work. That > > > is, QuarkusValueExpressionResolver can get the realm ID from the > injected > > > RealmContext as opposed to the API parameter. > > > > > > With that, we could probably change the interpretation of the > > > "realmIdentifier" expression to be the injected RealmContext, remove > > > explicit realm parameters from the generated API classes and move the > > > `@MeterTag(key="realm_id",expression="realmIdentifier")` annotation to > > the > > > method level. > > > > > > WDYT? > > > > > > Thanks, > > > Dmitri. > > > > > > On Fri, Jun 6, 2025 at 12:41 PM Alex Dutra > <alex.du...@dremio.com.invalid > > > > > > wrote: > > > > > > > Hi Dmitri, > > > > > > > > Thanks for bringing up this important topic. > > > > > > > > I generally agree with your refactoring proposal, but with 2 caveats: > > > > > > > > 1) For an application-scoped bean exposing methods to retrieve > > > realm-scoped > > > > components (e.g., the "factory" beans that expose methods like > > > > "getOrCreateXYZ") I think it's actually OK to receive the > RealmContext > > > as a > > > > method parameter, as opposed to having the RealmContext injected. > This > > is > > > > because the injection, while possible, would only work if the request > > > > context is active at the moment where a method on that bean is > invoked. > > > > This is hard to enforce at compile time and also hard to reason about > > > imho. > > > > IOW I prefer to see this: > > > > > > > > @ApplicationScoped public class XYZFactory { > > > > public XYZ getOrCreateXYZ(RealmContext ctx) { return > > > > doSomeStuff(ctx.getRealmIdentifier()); } > > > > } > > > > > > > > Rather than this: > > > > > > > > @ApplicationScoped public class XYZFactory { > > > > @Inject RealmContext ctx; > > > > public XYZ getOrCreateXYZ() { return > > > > doSomeStuff(ctx.getRealmIdentifier()); } > > > > } > > > > > > > > 2) We do need a RealmContext parameter in the API classes generated > > from > > > > Mustache templates [1], because the RealmContext parameter is > annotated > > > as > > > > follows: > > > > > > > > @Context @MeterTag(key="realm_id",expression="realmIdentifier") > > > > RealmContext realmContext > > > > > > > > This is so that we can produce "realm_id" tags for API metrics. Thus > we > > > > cannot remove it (unless there is another way to produce such tags, > but > > > I'm > > > > not aware of any). > > > > > > > > But we could remove it from the Mustache templates for *service* > > classes > > > > [2]. This would effectively remove RealmContext as a method parameter > > > from > > > > virtually all components down the call stack, thus forcing them to > > inject > > > > RealmContext wherever necessary. > > > > > > > > Thanks, > > > > > > > > Alex > > > > > > > > [1]: > > > > > > > > > > > > > > https://github.com/apache/polaris/blob/1baef0edec7dfd5d6440dc1d177a713c3b29055c/server-templates/api.mustache#L96 > > > > [2]: > > > > > > > > > > > > > > https://github.com/apache/polaris/blob/1baef0edec7dfd5d6440dc1d177a713c3b29055c/server-templates/apiService.mustache#L39 > > > > > > > > On Fri, Jun 6, 2025 at 3:49 AM Dmitri Bourlatchkov <di...@apache.org > > > > > > wrote: > > > > > > > > > An example "skew" in current code: IcebergCatalogAdapter gets > > > > RealmContext > > > > > injected as a field, but also as a method parameters in > > > > createNamespace(). > > > > > > > > > > On Thu, Jun 5, 2025 at 8:35 PM Dmitri Bourlatchkov < > di...@apache.org > > > > > > > > wrote: > > > > > > > > > > > Hi All, > > > > > > > > > > > > Recently a few runtime issues related to the resolution of > > > RealmContex > > > > > got > > > > > > fixed (mostly by Yun's PRs). > > > > > > > > > > > > These fixes center around the idea of passing RealmContext as a > > > > parameter > > > > > > across various method calls. > > > > > > > > > > > > That approach is effective, however, I tend to think it goes > > against > > > > the > > > > > > grain of the dependency injection paradigm. The main point is > that > > > > realm > > > > > is > > > > > > a contextual concept in Polaris REST APIs (it is not a parameter > in > > > any > > > > > API > > > > > > operations). Subsequently, all the service and core code logic is > > > > > normally > > > > > > realm-neutral. That is to say that services operate without > having > > to > > > > > know > > > > > > what realm they are servicing. > > > > > > > > > > > > Realm IDs become prominent only at the persistence level (to > ensure > > > > > > isolation of realm data). > > > > > > > > > > > > I'd like to propose refactoring Polaris code to remove > RealmContext > > > > from > > > > > > method parameters and keep it in instance fields only where it is > > > used. > > > > > > > > > > > > * MetaStore objects for REST API endpoints will be produced for a > > > > > > particular real with the request scope lifetime > > > > > > * Tasks will operate in a similar way, where PR [1817] provides a > > POC > > > > > > * The Quarkus runtime will use CDI for injection > > > > > > * The polaris-core module will remain free from CDI annotations, > > but > > > > some > > > > > > changes will probably be required to allow injection via > > > constructors. > > > > > > > > > > > > WDYT? > > > > > > > > > > > > If there are no strong objections up front, I can work on a > bigger > > > POC > > > > PR > > > > > > to allow reviewing / exploring this idea in more depth. > > > > > > > > > > > > [1817] https://github.com/apache/polaris/pull/1817 > > > > > > > > > > > > Cheers, > > > > > > Dmitri. > > > > > > > > > > > > > > > > > > > > >