Hi Jonas, thanks for sharing your thoughts and sorry for the late response.
Let me try to address your second point first: According to the current "MetaStoreManagerFactory.getOrCreateMetaStoreManager" api and implementations the "PolarisMetaStoreManager" isn't really application-scope but more like "realm-scoped", as repeated calls for the same realm are returning the same instance. If we look into "ServiceProducers" on the "main" branch the "PolarisMetaStoreManager" is kind of request-scoped already in the quarkus environment (i.e. when handling http requests). This shows that the "MetaStoreManagerFactory" is free to decide how to build the"PolarisMetaStoreManager" instance
it returns, currently it "caches" one instance per realm. So even with "PolarisMetaStoreManager" becoming request-scoped from the callers perspective, the "MetaStoreManagerFactory" is still free to do as much state sharing as it thinks is necessary under the hood. As for the current implementations in the codebase a notable change with the proposal is that there is a new "BasePersistence" instance per factory call. But since "BasePersistence" is becoming an implementation detail of the factory, other custom factories are free to deviate from this, as long as the returned "PolarisMetaStoreManager" instance represents a usable "metastore session" for users of the api. In your described scenario there would be an application-scoped bean for the HTTP client pool that would be available in the custom "MetaStoreManagerFactory" and when "createMetaStoreManager" gets called, it would have free control over how/when an HTTP client gets created or reused. Let me try to show a simpler example that might make this flexibility more obvious: https://github.com/apache/polaris/pull/2340/files#r2272428785 In this PR we made "PolarisAuthorizer" request-scoped with the benefit that all users of that interface could be simplified. The creation of "PolarisAuthorizer" in "ServiceProducers" still allows passing in any application- or request-scoped bean depending on what the actual implementation needs. In the case of "PolarisAuthorizerImpl" only the "RealmConfig" is needed. Custom implementations of "PolarisAuthorizer" could still pass in the "CallContext" if they need it. The general idea is to simplify the users of these interface as much as possible with the trade-off to move most of the complexity and customization-capabilities to object creation time. The difference of "PolarisMetaStoreManager" compared to "PolarisAuthorizer" is that we are currently somewhat constrained by the existing "MetaStoreManagerFactory" api, as there are multiple callers of "getOrCreateMetaStoreManager" and only a "RealmContext" parameter is provided. The currently proposed PR aims to reduce these call sites to the minimum. But as stated above, since "MetaStoreManagerFactory" is application-scoped it is free to use any application-scoped bean or derive anything from those beans in combination with the given "RealmContext" parameter when constructing the "PolarisMetaStoreManager" instance. In that sense the proposal could also be renamed to "letPolarisMetaStoreManager alone represent the persistence session" (as currently you need both "PolarisMetaStoreManager" and "CallContext.getMetaStore" to access the persistence layer) This brings me to your first point. You make a good argument that the proposal PR in its current form would no longer allow to have custom "CallContext" classes that can pass additional request-specific information into the persistence layer. In the polaris codebase this is not obvious as "PolarisCallContext" is now quite thin and doesn't do this, but your point still stands for other codebases. I've put up another PR that is mostly based on the previous one: https://github.com/apache/polaris/pull/2833 but the 2nd commit changes the parameter of "MetaStoreManagerFactory.createMetaStoreManager" to be "CallContext" instead of just "RealmContext". Most call sites can handle this easily but only some (i.e. tasks) need to build a new "CallContext" from their local "RealmContext". Would you agree that this addresses your first concern ? As it came up in this email thread one other thing I would like to note is that the proposal PR is not supposed to rely on proxies for using request-scoped beans in any application-scoped beans. I consider this to be a bad practice and if you can point out any place where the PR does this I will gladly work on fixing that. Afaict most (all?) of the use spots work naturally with a request-scoped "PolarisMetaStoreManager" as the classes are already request-scoped themselves. One way to think about it is that if the object has a "CallContext" field it means the object only exists within the handling of a single "request" already. The object's lifetime is request-scoped. Let me know what you think, Christopher On 09.10.25 01:39, Honah J. wrote:
Thanks for the proposal and the effort of this refactoring, Christopher! Really appreciate the detailed summary of pros and cons in the document I share some of Michael’s concerns and wanted to add a bit more perspective: 1. Removal of CallContext from PolarisMetastoreManager As also mentioned in the proposal, removing CallContext completely means that custom implementations can no longer pass additional request-specific information into the persistence layer.Since all requests eventually go through the persistence APIs, allowing request-specific or custom metadata to be passed in could help keep the interface more extensible and adaptable for future use cases. 2. Making PolarisMetaStoreManager request-scoped IMHO, one advantage of keeping the PolarisMetaStoreManager application-scoped is that it can hold state independent of individual requests. For example, if someone wants to back the metastore with a REST-based persistence service, the manager could maintain a shared HTTP client pool at the application level and serialize operations over REST. We’ve already seen similar REST-based integration patterns, such as in OpaPolarisAuthorizer. In general, I think interfaces like persistence, auth, and authorizer would benefit from remaining framework-agnostic and flexible for REST-style integrations — where parameters carry all request-specific information, while the instance itself holds request-independent state for optimization. Love to hear others' thoughts on this! Best Regards, Jonas
