Hi Jonas,

thanks for sharing your thoughts and sorry for the late response.

Let me try to address your second point first: According to the current "MetaStoreManagerFactory.getOrCreateMetaStoreManager" api and implementations the "PolarisMetaStoreManager" isn't really application-scope but more like "realm-scoped", as repeated calls for the same realm are returning the same instance. If we look into "ServiceProducers" on the "main" branch the "PolarisMetaStoreManager" is kind of request-scoped already in the quarkus environment (i.e. when handling http requests). This shows that the "MetaStoreManagerFactory" is free to decide how to build the"PolarisMetaStoreManager" instance
it returns, currently it "caches" one instance per realm.
So even with "PolarisMetaStoreManager" becoming request-scoped from the callers 
perspective, the "MetaStoreManagerFactory"
is still free to do as much state sharing as it thinks is necessary under the 
hood.
As for the current implementations in the codebase a notable change with the 
proposal is that there is a new
"BasePersistence" instance per factory call.
But since "BasePersistence" is becoming an implementation detail of the 
factory, other custom factories are
free to deviate from this, as long as the returned "PolarisMetaStoreManager" 
instance represents a usable
"metastore session" for users of the api.

In your described scenario there would be an application-scoped bean for the 
HTTP client pool that would be available
in the custom "MetaStoreManagerFactory" and when "createMetaStoreManager" gets 
called, it would have free control over
how/when an HTTP client gets created or reused.

Let me try to show a simpler example that might make this flexibility more 
obvious:
https://github.com/apache/polaris/pull/2340/files#r2272428785
In this PR we made "PolarisAuthorizer" request-scoped with the benefit that all 
users of that interface could be simplified.
The creation of "PolarisAuthorizer" in "ServiceProducers" still allows passing 
in any application- or request-scoped bean
depending on what the actual implementation needs.
In the case of "PolarisAuthorizerImpl" only the "RealmConfig" is needed.
Custom implementations of "PolarisAuthorizer" could still pass in the 
"CallContext" if they need it.

The general idea is to simplify the users of these interface as much as 
possible with the trade-off to move most of the
complexity and customization-capabilities to object creation time. The difference of 
"PolarisMetaStoreManager" compared
to "PolarisAuthorizer" is that we are currently somewhat constrained by the existing 
"MetaStoreManagerFactory" api, as
there are multiple callers of "getOrCreateMetaStoreManager" and only a 
"RealmContext" parameter is provided.
The currently proposed PR aims to reduce these call sites to the minimum.
But as stated above, since "MetaStoreManagerFactory" is application-scoped it 
is free to use any application-scoped bean
or derive anything from those beans in combination with the given 
"RealmContext" parameter when constructing the
"PolarisMetaStoreManager" instance.
In that sense the proposal could also be renamed to "letPolarisMetaStoreManager 
alone represent the persistence session"
(as currently you need both "PolarisMetaStoreManager" and 
"CallContext.getMetaStore" to access the persistence layer)

This brings me to your first point.
You make a good argument that the proposal PR in its current form would no longer allow 
to have custom "CallContext" classes
that can pass additional request-specific information into the persistence 
layer.
In the polaris codebase this is not obvious as "PolarisCallContext" is now 
quite thin and doesn't do this, but your point
still stands for other codebases.

I've put up another PR that is mostly based on the previous one:
https://github.com/apache/polaris/pull/2833
but the 2nd commit changes the parameter of 
"MetaStoreManagerFactory.createMetaStoreManager" to be "CallContext" instead of 
just
"RealmContext". Most call sites can handle this easily but only some (i.e. tasks) need to 
build a new "CallContext" from their
local "RealmContext".

Would you agree that this addresses your first concern ?

As it came up in this email thread one other thing I would like to note is that 
the proposal PR is not supposed to rely on
proxies for using request-scoped beans in any application-scoped beans. I 
consider this to be a bad practice and if you
can point out any place where the PR does this I will gladly work on fixing 
that.

Afaict most (all?) of the use spots work naturally with a request-scoped 
"PolarisMetaStoreManager" as the classes are already
request-scoped themselves. One way to think about it is that if the object has a 
"CallContext" field it means the object
only exists within the handling of a single "request" already. The object's 
lifetime is request-scoped.

Let me know what you think,
Christopher

On 09.10.25 01:39, Honah J. wrote:
Thanks for the proposal and the effort of this refactoring, Christopher!
Really appreciate the detailed summary of pros and cons in the document

I share some of Michael’s concerns and wanted to add a bit more perspective:

1. Removal of CallContext from PolarisMetastoreManager
As also mentioned in the proposal, removing CallContext completely means
that custom implementations can no longer pass additional request-specific
information into the persistence layer.Since all requests eventually go
through the persistence APIs, allowing request-specific or custom metadata
to be passed in could help keep the interface more extensible and adaptable
for future use cases.

2. Making PolarisMetaStoreManager request-scoped
IMHO, one advantage of keeping the PolarisMetaStoreManager
application-scoped is that it can hold state independent of individual
requests. For example, if someone wants to back the metastore with a
REST-based persistence service, the manager could maintain a shared HTTP
client pool at the application level and serialize operations over REST.
We’ve already seen similar REST-based integration patterns, such as in
OpaPolarisAuthorizer. In general, I think interfaces like persistence,
auth, and authorizer would benefit from remaining framework-agnostic and
flexible for REST-style integrations — where parameters carry all
request-specific information, while the instance itself holds
request-independent state for optimization.

Love to hear others' thoughts on this!

Best Regards,
Jonas

Reply via email to