Great!

I will schedule it.

Regards
JB

On Thu, Feb 6, 2025 at 3:05 AM Dennis Huo <huoi...@gmail.com> wrote:
>
> > What about scheduling a meeting (community) specific to persistence?
> > I think it would be great to discuss.
> > We can give time to anyone interested to read the proposals, and
> > discuss next week?
>
> That works for me! Are you able to schedule it since you have the Google
> Meet features?
>
> > I propose to delegate this to the Persistence implementation and not expose
> > anything about "indexes" in the interfaces used by Polaris services.
> >
> > I believe it is best to formulate the interface in terms of what queries
> > need to be supported and let the implementation find the most appropriate
> > way to do that in each particular case.
>
> Agreed, that is also my proposal. I'm using the term "design" to include
> both the interface design and the database-specific design since folks
> seemed to have questions about database-specific details.
>
> I've structured the doc now so that you can ignore the database-level
> proposals if you want. The interface says nothing about indexes (indeed, it
> is fundamental to my proposal not to expose indexes, because FDB does not
> have indexes).
>
> > Maybe we can point to specific parts of the Java interface to have clarity
>
> For example, this is the interface for updating an entity from
> PolarisMetaStoreManager:
>
>   /**
>    * Update some properties of this entity assuming it can still be resolved the same way and itself
>    * has not changed. If this is not the case we will return false. Else we will update both the
>    * internal and visible properties and return true
>    *
>    * @param session the metastore session
>    * @param catalogPath path to that entity. Could be null if this entity is top-level
>    * @param entity entity to update, cannot be null
>    * @return the entity we updated or null if the client should retry
>    */
>   @Nonnull
>   EntityResult updateEntityPropertiesIfNotChanged(
>       @Nonnull PolarisMetaStoreSession session,
>       @Nullable List<PolarisEntityCore> catalogPath,
>       @Nonnull PolarisBaseEntity entity);
>
> I'd say "queries" imply a particular backend implementation as well -- we
> should focus on the Java interface, and then the implementation could use a
> "query" or other APIs as they see fit.
>
> On Wed, Feb 5, 2025 at 10:34 AM Jean-Baptiste Onofré <j...@nanthrax.net>
> wrote:
>
> > Hi Dennis,
> >
> > What about scheduling a meeting (community) specific to persistence?
> > I think it would be great to discuss.
> > We can give time to anyone interested to read the proposals, and
> > discuss next week?
> >
> > Maybe we can do it on Thursday, Feb 13, 9am PST (same slot as Polaris
> > Community Meeting)?
> >
> > Thanks
> > Regards
> > JB
> >
> > On Tue, Feb 4, 2025 at 8:02 PM Dennis Huo <huoi...@gmail.com> wrote:
> > >
> > > Hello all,
> > >
> > > We've had some discussions and GitHub issues (
> > > https://github.com/apache/polaris/issues/775,
> > > https://github.com/apache/polaris/issues/766, etc) scattered between
> > > community syncs, Slack threads, etc., related to how to adapt the Polaris
> > > persistence layer to new DB backends, so I'm hoping we can consolidate
> > > discussions towards an incremental path forward that is minimally
> > > invasive.
> > >
> > > I wrote this analysis of the persistence layer in the context of a couple of
> > > persistence backends that have been suggested (MongoDB, DynamoDB) and also
> > > tried to retroactively clarify the current structure and intent of some of
> > > the persistence internals:
> > >
> > > https://docs.google.com/document/d/1U9rprj8w8-Q0SnQvRMvoVlbX996z-89eOkVwWTQaZG0/edit?tab=t.0
> > >
> > > It's a bit into the weeds, so will be most accessible for folks who have
> > > taken a bit of a deep dive into the current
> > > PolarisMetaStoreManager/PolarisMetaStoreSession layers.
> > >
> > > I'm not a MongoDB or DynamoDB expert though, so I'd appreciate any input to
> > > help keep me honest on the capabilities :)
> > >
> > > At a high level, the biggest takeaway is that we *don't* need generalized
> > > transactions, and some refactoring will allow us to have an abstract
> > > top-level interface where 99% of use cases are covered either by:
> > >
> > > - Secondary indexes with "UNIQUE" constraints plus single-entity
> > >   Compare-and-Swap
> > > - Multi-statement transactions for those that support it (FoundationDB,
> > >   Postgres)
> > >
> > > One notable outlier is Iceberg's "commitTransaction" API for multi-table
> > > Iceberg transactions, which will require at least a "TransactBatch with
> > > conditional writes per entity" semantic. We could evolve this one over
> > > time, but on the plus side it seems this is still mostly supported by all
> > > the mentioned backends so far.
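
For readers who have not looked at the persistence internals, the "single-entity Compare-and-Swap" and "conditional writes per entity" ideas from the quoted proposal can be illustrated with a rough sketch. This is not Polaris code: CasSketch, Entity, ConditionalWrite and all method names below are hypothetical, the store is a plain in-memory map guarded by a lock, and a real backend would presumably rely on its own conditional-write primitive instead.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

/** Hypothetical in-memory store; none of these names come from Polaris. */
public class CasSketch {

  /** An entity row whose version is bumped on every successful write. */
  record Entity(String name, int version, Map<String, String> properties) {}

  /** One conditional write inside a batch. */
  record ConditionalWrite(Entity expected, Map<String, String> newProperties) {}

  private final Map<String, Entity> rows = new HashMap<>();

  /** Creates the entity at version 1; rejects a name that is already taken (uniqueness check). */
  synchronized Optional<Entity> create(String name, Map<String, String> properties) {
    if (rows.containsKey(name)) {
      return Optional.empty();
    }
    Entity created = new Entity(name, 1, Map.copyOf(properties));
    rows.put(name, created);
    return Optional.of(created);
  }

  /** Single-entity compare-and-swap: writes only if the stored version still matches. */
  synchronized Optional<Entity> updateIfNotChanged(
      Entity expected, Map<String, String> newProperties) {
    Entity current = rows.get(expected.name());
    if (current == null || current.version() != expected.version()) {
      return Optional.empty(); // someone else wrote first; caller should re-read and retry
    }
    Entity updated =
        new Entity(current.name(), current.version() + 1, Map.copyOf(newProperties));
    rows.put(updated.name(), updated);
    return Optional.of(updated);
  }

  /** All-or-nothing batch: every per-entity condition must hold, or nothing is written. */
  synchronized boolean updateAllIfNotChanged(List<ConditionalWrite> writes) {
    Set<String> seen = new HashSet<>();
    for (ConditionalWrite w : writes) {
      Entity current = rows.get(w.expected().name());
      boolean duplicate = !seen.add(w.expected().name());
      if (duplicate || current == null || current.version() != w.expected().version()) {
        return false; // reject the whole batch before touching any row
      }
    }
    for (ConditionalWrite w : writes) {
      updateIfNotChanged(w.expected(), w.newProperties());
    }
    return true;
  }

  /** Returns the current row for the given name, if any. */
  synchronized Optional<Entity> read(String name) {
    return Optional.ofNullable(rows.get(name));
  }

  public static void main(String[] args) {
    CasSketch store = new CasSketch();
    Entity v1 = store.create("ns1.table1", Map.of("owner", "alice")).orElseThrow();
    Optional<Entity> fresh = store.updateIfNotChanged(v1, Map.of("owner", "bob"));
    Optional<Entity> stale = store.updateIfNotChanged(v1, Map.of("owner", "carol"));
    System.out.println("fresh write applied: " + fresh.isPresent());
    System.out.println("stale write applied: " + stale.isPresent());
  }
}
```

The batch method is only meant to show the shape of the semantic needed for something like a multi-table commit: each write carries its own expected version, and a single stale entry rejects the entire batch.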