I think this is a great idea. Even if we put aside the NoSQL / RDBMS point, simply clarifying the roles & responsibilities of the persistence interface(s) would be a welcome improvement.
--EM

On Tue, Dec 3, 2024 at 5:57 AM Dmitri Bourlatchkov <dmitri.bourlatch...@dremio.com.invalid> wrote:

> Hi All,
>
> I believe it was already discussed elsewhere that it is valuable to allow
> Apache Polaris to be extensible, and in particular extensible in how it
> interacts with its own Persistence backend (not to be confused with Iceberg
> data storage).
>
> I’d like to formalize the expectations Polaris Core has on Persistence. I
> think it will be extremely valuable for contributors wishing to add support
> for backends beyond the current EclipseLink implementation.
>
> Currently, the closest abstraction layer for Persistence appears to be
> PolarisMetaStoreManager, however this interface combines a few other
> interfaces, not directly related to persistence per se and having different
> concerns:
>
> - The Grant Manager
> - Remote Cache
> - Secrets Manager
> - Credential Vendor
>
> I’d like to propose:
>
> 1. Interface delineation. Split off a “pure” persistence SPI that does not
>    directly deal with grants or caching, but could be used by the grant
>    manager and by caches in their respective contexts. Many of the
>    PolarisMetaStoreManager sub-interfaces are not related to persistence.
>    Once isolated, they will be outside the scope of this discussion.
>
> 2. Bootstrapping. This should probably be an external concern implemented
>    generically for any Persistence implementation.
>
> 3. Consistency guarantees. Catalog API implementations have to perform
>    several changes as part of one logical transaction (e.g. multi-table
>    commit). Several servers acting in a distributed system on the same
>    backend should know what consistency expectations they can have on the
>    Persistence layer in order to function correctly. I think these
>    guarantees should be stated explicitly in java or .md docs for the sake
>    of clarity.
>
> 4. Transactions. I believe that it would be valuable to avoid specifically
>    binding to the RDBMS transaction concept and if possible formulate the
>    Persistence SPI in a way that could be mapped to RDBMS as well as to a
>    NoSQL backend.
>
> Please share your thoughts on this.
>
> Thanks,
>
> Dmitri.
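
To make the Transactions point in the proposal above a bit more concrete, here is a minimal sketch of one way a persistence SPI could avoid binding to RDBMS transactions: a version-checked conditional write. Everything in it is hypothetical. `VersionedEntity`, `EntityStore`, `writeIfVersionMatches`, and `InMemoryEntityStore` are names made up for illustration and are not part of Polaris or of the proposal itself. The assumption is that a conditional write could be backed by a guarded `UPDATE ... WHERE version = ?` on an RDBMS or by a conditional put on a NoSQL store; the in-memory class only stands in for such a backend. It covers a single entity and does not attempt the multi-table commit case.

```java
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;

/** Hypothetical versioned record; not a Polaris type. */
final class VersionedEntity {
    final String id;
    final long version;
    final String payload;

    VersionedEntity(String id, long version, String payload) {
        this.id = id;
        this.version = version;
        this.payload = payload;
    }
}

/** Hypothetical "pure" persistence SPI: no grants, caching, secrets, or credentials. */
interface EntityStore {
    /** Returns the stored entity, or empty if none exists under this id. */
    Optional<VersionedEntity> read(String id);

    /**
     * Stores newPayload only if the currently stored version equals
     * expectedVersion (0 means "the entity must not exist yet").
     * Returns true if the write was applied, false if the caller was stale.
     */
    boolean writeIfVersionMatches(String id, long expectedVersion, String newPayload);
}

/** In-memory stand-in for a real backend, for illustration only. */
final class InMemoryEntityStore implements EntityStore {
    private final ConcurrentMap<String, VersionedEntity> entities = new ConcurrentHashMap<>();

    @Override
    public Optional<VersionedEntity> read(String id) {
        return Optional.ofNullable(entities.get(id));
    }

    @Override
    public boolean writeIfVersionMatches(String id, long expectedVersion, String newPayload) {
        boolean[] applied = {false};
        // compute() runs atomically per key, so the check and the write
        // cannot interleave with another writer on the same id.
        entities.compute(id, (key, current) -> {
            long currentVersion = (current == null) ? 0 : current.version;
            if (currentVersion != expectedVersion) {
                return current; // stale caller: leave the stored value untouched
            }
            applied[0] = true;
            return new VersionedEntity(id, expectedVersion + 1, newPayload);
        });
        return applied[0];
    }
}
```

A caller would read an entity, compute the change, and call `writeIfVersionMatches` with the version it read, retrying on `false`. Whether a contract like this is sufficient for the consistency guarantees discussed in point 3 is exactly the kind of question the proposed documentation would need to answer.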