Nice document, Dennis! Thanks for putting it together. I find it very useful.
I added a lot of comments, not as critique, but hoping to achieve clarity.

One thing which is missing from the doc, IMHO, is the discussion of handling
concurrent reads / writes made by different instances of the Polaris Server.
Would you be able to elaborate on that?

Thanks,
Dmitri.

On Tue, Feb 4, 2025 at 2:03 PM Dennis Huo <huoi...@gmail.com> wrote:

> Hello all,
>
> We've had some discussions and github issues (
> https://github.com/apache/polaris/issues/775,
> https://github.com/apache/polaris/issues/766, etc) scattered between
> community syncs, slack threads, etc., related to how to adapt the Polaris
> persistence layer to new DB backends, so I'm hoping we can consolidate
> discussions towards an incremental path forward that is minimally invasive.
>
> I wrote this analysis of the persistence layer in the context of a couple
> persistence backends that have been suggested (MongoDB, DynamoDB) and also
> tried to retroactively clarify the current structure and intent of some of
> the persistence internals:
>
> https://docs.google.com/document/d/1U9rprj8w8-Q0SnQvRMvoVlbX996z-89eOkVwWTQaZG0/edit?tab=t.0
>
> It's a bit into the weeds, so will be most accessible for folks who have
> taken a bit of a deep dive into the current
> PolarisMetaStoreManager/PolarisMetaStoreSession layers.
>
> I'm not a MongoDB or DynamoDB expert though, so I'd appreciate any input to
> help keep me honest on the capabilities :)
>
> At a high level, the biggest takeaway is that we *don't* need generalized
> transactions, and some refactoring will allow us to have an abstract
> top-level interface where 99% of use cases are covered either by:
>
> - Secondary indexes with "UNIQUE" constraints plus single-entity
>   Compare-and-Swap
> - Multi-statement transactions for those that support it (FoundationDB,
>   Postgres)
>
> One notable outlier is Iceberg's "commitTransaction" API for multi-table
> Iceberg transactions, which will require at least a "TransactBatch with
> conditional writes per entity" semantic. We could evolve this one over
> time, but on the plus side it seems this is still mostly supported by all
> the mentioned backends so far.