Hi Dennis,

Thanks for starting this thread! I will read the resources you mentioned.

As I also started a design proposal doc with Jack, I'm linking it in this thread (it's a WIP):

https://docs.google.com/document/d/1LlNhEy4cBjjE_um694fcsnizqd3rDm5pbewXkLxvu1o/edit?usp=sharing

I will continue the code "illustration" tomorrow (my time).

Regards
JB

On Tue, Feb 4, 2025 at 8:02 PM Dennis Huo <huoi...@gmail.com> wrote:
>
> Hello all,
>
> We've had some discussions and github issues (
> https://github.com/apache/polaris/issues/775,
> https://github.com/apache/polaris/issues/766, etc) scattered between
> community syncs, slack threads, etc., related to how to adapt the Polaris
> persistence layer to new DB backends, so I'm hoping we can consolidate
> discussions towards an incremental path forward that is minimally invasive.
>
> I wrote this analysis of the persistence layer in the context of a couple
> persistence backends that have been suggested (MongoDB, DynamoDB) and also
> tried to retroactively clarify the current structure and intent of some of
> the persistence internals:
>
> https://docs.google.com/document/d/1U9rprj8w8-Q0SnQvRMvoVlbX996z-89eOkVwWTQaZG0/edit?tab=t.0
>
> It's a bit into the weeds, so will be most accessible for folks who have
> taken a bit of a deep dive into the current
> PolarisMetaStoreManager/PolarisMetaStoreSession layers.
>
> I'm not a MongoDB or DynamoDB expert though, so I'd appreciate any input to
> help keep me honest on the capabilities :)
>
> At a high level, the biggest takeaway is that we *don't* need generalized
> transactions, and some refactoring will allow us to have an abstract
> top-level interface where 99% of use cases are covered either by:
>
>    - Secondary indexes with "UNIQUE" constraints plus single-entity
>      Compare-and-Swap
>    - Multi-statement transactions for those that support it (FoundationDB,
>      Postgres)
>
> One notable outlier is Iceberg's "commitTransaction" API for multi-table
> Iceberg transactions, which will require at least a "TransactBatch with
> conditional writes per entity" semantic. We could evolve this one over
> time, but on the plus side it seems this is still mostly supported by all
> the mentioned backends so far.
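
To make the two building blocks named in the quoted summary more concrete, here is a minimal in-memory sketch in Python. It is not Polaris code and not taken from either design doc: `Entity`, `InMemoryStore`, `create`, `compare_and_swap`, and `transact_batch` are hypothetical names invented for illustration, and the version-number scheme is an assumption. A real backend would have to enforce these checks atomically on the server side; this sketch only shows the intended semantics.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class Entity:
    # Hypothetical entity shape; field names are assumptions for this sketch.
    entity_id: int
    parent_id: int
    name: str
    version: int
    payload: str


class InMemoryStore:
    """Toy store: primary map by id plus a UNIQUE index on (parent_id, name)."""

    def __init__(self) -> None:
        self._by_id: Dict[int, Entity] = {}
        self._by_name: Dict[Tuple[int, str], int] = {}

    def get(self, entity_id: int) -> Optional[Entity]:
        return self._by_id.get(entity_id)

    def create(self, entity: Entity) -> bool:
        """Insert only if neither the id nor the (parent_id, name) key exists."""
        key = (entity.parent_id, entity.name)
        if entity.entity_id in self._by_id or key in self._by_name:
            return False
        self._by_id[entity.entity_id] = entity
        self._by_name[key] = entity.entity_id
        return True

    def _can_swap(self, expected_version: int, updated: Entity) -> bool:
        current = self._by_id.get(updated.entity_id)
        if current is None or current.version != expected_version:
            return False  # missing entity or stale version
        old_key = (current.parent_id, current.name)
        new_key = (updated.parent_id, updated.name)
        # A rename must not collide with another entity's unique key.
        return new_key == old_key or new_key not in self._by_name

    def compare_and_swap(self, expected_version: int, updated: Entity) -> bool:
        """Single-entity CAS: write only if the stored version matches."""
        if not self._can_swap(expected_version, updated):
            return False
        current = self._by_id[updated.entity_id]
        del self._by_name[(current.parent_id, current.name)]
        self._by_name[(updated.parent_id, updated.name)] = updated.entity_id
        self._by_id[updated.entity_id] = updated
        return True

    def transact_batch(self, writes: List[Tuple[int, Entity]]) -> bool:
        """All-or-nothing batch with one version condition per entity.

        Simplification: each entity may appear once and must keep its
        (parent_id, name) key, so validating first makes the apply step safe.
        """
        ids = [updated.entity_id for _, updated in writes]
        if len(ids) != len(set(ids)):
            return False
        for expected_version, updated in writes:
            current = self._by_id.get(updated.entity_id)
            if current is None or current.version != expected_version:
                return False
            if (current.parent_id, current.name) != (updated.parent_id, updated.name):
                return False
        for _, updated in writes:
            self._by_id[updated.entity_id] = updated
        return True
```

In this sketch the uniqueness check stands in for a secondary index with a "UNIQUE" constraint, and `transact_batch` stands in for the "TransactBatch with conditional writes per entity" semantic: if any single condition fails, nothing is written.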