Hi Prashant If JsonB is PostgreSQL/JDBC specific, it makes sense to me as it's implementation specific., so +1 from my side. We should stay "abstract" in BasePersistence (and other interfaces).
If backward compatibility is good to have, it's best effort. Same for zero downtime. We can do our best, but if it's too much constraint, then we can document it. Regards JB On Tue, Apr 1, 2025 at 5:43 PM Prashant Singh <prashant.si...@snowflake.com.invalid> wrote: > > Thank you Dennis and Eric. > > I agree whenever we introduce (2) in BasePersistence we should definitely > re:think if this is a query pattern > that can be supported by most of the DBs (Relational / NoSQL) before > accepting it for BasePersistence. > Presently this specific change is not gonna introduce any new query > pattern, so I think that should be fine. > Given that BasePersistence does not change it's actually an implementation > detail on how we store and query the underlying objects within > BasePersistence impl. > > Regarding the zero downtime, 100% backward compatibility goal, I think we > are on the same page that it was our best effort. > > IMHO JSON B is the right thing to do as the underlying object is actually a > JSON, for even debugging purposes, or telemetry, presently a user debugging > some field in their prop will have to do a string search for the text. > > Considering the above please let me know what your thoughts are ? > > Best, > Prashant Singh > > On Sat, Mar 29, 2025 at 1:16 AM Eric Maynard <eric.w.mayn...@gmail.com> > wrote: > > > I'm generally a +1 on using JSONB within Postgres. > > > > However I am in agreement with Dennis that we should avoid his item (2) > > above. If the application will need to load entities by some attributes A, > > B, and C then we should create methods loadEntityByA(a: A), > > loadEntityByB(b: B), and loadEntityByC(c: C) rather than writing some > > catch-all loadEntityByPredicate(?: ?). In this way, it will be clearly > > defined what the access patterns to the metastore are expected to be and > > what the contract metastore implementations are expected to fulfill is. > > > > Whether the postgres metastore decides to implement loadEntityByC using a > > column C or by using a query against a JSON blob, I suppose that we can > > make that evaluation on a case-by-case basis. I think at least having a > > JSONB column seems reasonable. > > > > On Fri, Mar 21, 2025 at 3:40 PM Dennis Huo <huoi...@gmail.com> wrote: > > > > > Re: "non-transactional", I guess it's more descriptive to refer to this > > new > > > implementation as "AtomicOperationJdbcPersistence" if it's confusing to > > say > > > "non-transactional"; the key is simply that we don't expose any > > > "runInTransaction" to the AtomicOperationMetaStoreManager layer. > > > > > > Regarding Robert's question about "same schema", the important > > distinction > > > is that all the prior discussions about "ease of migration" are more > > > focused on in-place *unidirectional schema compatibility*, not > > necessarily > > > *exact same schema*. > > > > > > This distinction is important because it's true that the current > > > EclipseLink implementation's three tightly-coupled tables ModelEntity, > > > ModelEntityActive, and ModelEntityChangeTracking indeed are not all > > carried > > > over to the "AtomicOperationJdbcPersistence" implementation, and that > > > keeping those three tables means keeping inefficient transaction > > behaviors. > > > > > > However, keeping "ModelEntity" but adding a secondary index on it allows > > > unidirectional compatibility (migrating from old EclipseLink -> new > > > AtomicOperationJdbcPersistence), without still keeping those three > > > tightly-coupled tables. > > > > > > I suppose it's also true though that backwards-compatibility right now is > > > kind of "best effort" at this stage in the project; if we do discover > > > unforeseen serious shortcomings in trying to keep the "ModelEntity" table > > > the same, it's reasonable to consider giving existing users a different > > > migration path and defining a totally different schema if needed. But by > > > default it's "nice" if it happens to be easy to preserve that > > compatibility > > > > > > Regarding JSONB, on the one hand, self-contained optimizations within a > > > given persistence implementation (e.g. silently re-serialize any > > retrieved > > > JSONB-backed JsonObjects into a plain JSON String to keep the current > > > PolarisBaseEntity interface unchanged) seem safe enough but we wouldn't > > > really reap much benefit (other than maybe a slight generalized > > performance > > > improvement) without either: > > > > > > 1. Retooling the db-agnostic layer so that we efficiently interact with a > > > JSONB-backed JsonObject directly instead of how the code currently > > expects > > > a standard JSON string and redundantly deserializes it a bunch. > > > 2. Actually using query patterns that index on fields within the JSON > > > > > > However, it's unclear how we expect to do (2) if we also have to support > > > the other persistence backend implementations. Would we just optimize the > > > index on PolicyType just for Postgres but not for the other > > > implementations? > > > > > > If we wanted to make indexing of other subfields first-class, we probably > > > need a better way of conveying that intent. > > > > > > Overall my take is it's fine enough if we think JSONB and/or "separate > > > tables per type" have enough benefits to break the > > > EclipseLink-compatibility behavior, and then using JSONB only as an > > > internal detail is harmless, same as if a persistence implementation > > choose > > > to transparently store text compressed for example. > > > > > > If we're going to use fancy indexing features though we should make sure > > we > > > have an idea of how we intend to do that for other persistence backends > > as > > > well (or ideally, if it fits into a general model we only need to > > implement > > > once for each persistence backend and then can reuse for all sorts of new > > > entity types). > > > > > > > > > On Fri, Mar 21, 2025 at 2:41 PM Yufei Gu <flyrain...@gmail.com> wrote: > > > > > > > Thanks, Prashant, for working on this! > > > > > > > > Using JSONB in the WIP JDBC implementation makes sense. For the > > > EclipseLink > > > > implementation, I'd recommend keeping the current schema as-is (without > > > > JSONB) so existing users don’t need to go through a migration. > > > > > > > > If users do want to migrate, I’d suggest moving directly to the JDBC > > > > implementation rather than adapting EclipseLink to the new JSONB field > > > > > > > > Yufei > > > > > > > > > > > > On Fri, Mar 21, 2025 at 9:08 AM Prashant Singh > > > > <prashant.si...@snowflake.com.invalid> wrote: > > > > > > > > > Hi Robert, > > > > > > > > > > Thank you for your response! > > > > > > > > > > To clarify my point about "non-transactional" operations, I meant > > that > > > > most > > > > > operations don't require full transactions, with the exception of > > > methods > > > > > such as writeEntities. For this specific case, Dennis created a new > > > > > implementation of Polaris MetaStore Manager, AtomicMetaStoreManager > > > [1], > > > > > which: > > > > > > > > > > - Implements PolarisMetaStoreManager using only one-shot atomic > > > > > operations within a BasePersistence implementation. > > > > > - Avoids requirements of open-ended multi-statement transactions. > > > > > > > > > > This approach doesn't necessitate model changes. My understanding is > > > that > > > > > we've agreed on this implementation in its current state (as Dimitri > > > > > approved it), and the JDBCImpl will simply implement BasePersistence > > > [2] > > > > > and utilize AtomicMetaStoreManager to eliminate the need for > > > > EclipseLink's > > > > > transactional operations. > > > > > > > > > > Regarding your comment: > > > > > > > > > > because IIRC the requirements that some people voiced is to keep the > > > same > > > > > schema as the current Eclipselink one > > > > > > > > > > Yes, I'm aiming to maintain the same schema as EclipseLink in the new > > > > JDBC > > > > > implementation to facilitate easy migration. The entity table, grants > > > > > table, and all other schemas remain consistent. > > > > > > > > > > Regarding the JSONB suggestion, I wanted to gather the community's > > > > > thoughts. If we agree, we could incorporate it into both the JDBC and > > > > > EclipseLink implementations, or solely into the JDBC implementation. > > As > > > > > mentioned, JSONB would be beneficial for supporting new query > > patterns. > > > > > > > > > > Please share your thoughts on this. > > > > > > > > > > Best regards, > > > > > > > > > > Prashant Singh > > > > > > > > > > [1] > > > > > > > > > > > > > > > > > > > https://github.com/apache/polaris/blob/main/polaris-core/src/main/java/org/apache/polaris/core/persistence/AtomicOperationMetaStoreManager.java > > > > > [2] > > > > > > > > > > > > > > > > > > > https://github.com/apache/polaris/blob/main/polaris-core/src/main/java/org/apache/polaris/core/persistence/BasePersistence.java > > > > > > > > > > On Fri, Mar 21, 2025 at 5:23 AM Robert Stupp <sn...@snazy.de> wrote: > > > > > > > > > > > Hey Prashant, > > > > > > > > > > > > thanks for working on this! > > > > > > > > > > > > I'm a bit confused about "non transactional Postgres" - how's PG > > not > > > > > > transactional? If so (like one DML per tx), how is consistency > > > > > > guaranteed, because IIRC the requirements that some people voiced > > is > > > to > > > > > > keep the same schema as the current Eclipselink one. I've been a > > bit > > > > > > confused whether people use that one or not, because different > > people > > > > > > tell different things. Can you shed some light on that? I mean, if > > EL > > > > > > isn't used and there are better ways, why should we keep EL at all? > > > > > > > > > > > > From my experience (via #1189), it's not possible to just issue > > one > > > > DML > > > > > > per PG tx and guarantee consistency with the data model defined by > > > EL. > > > > > > So I suspect you're working on a different database schema. If so, > > > how > > > > > > does it work? > > > > > > > > > > > > Can you share your design and work with everybody here? > > > > > > > > > > > > Robert > > > > > > > > > > > > [1] https://github.com/apache/polaris/pull/1189 > > > > > > > > > > > > On 20.03.25 19:21, Prashant Singh wrote: > > > > > > > Hey folks, > > > > > > > I am presently working on implementing non transactional postgres > > > > > > > implementation using jdbc. in the course of that i noticed in the > > > > > entity > > > > > > > table that the column type of the properties / internal > > properties > > > > > column > > > > > > > is TEXT and not JSONB, I dug a bit into the history of this, > > seems > > > > like > > > > > > we > > > > > > > did wanted to have JSONB but ended up implementing it as TEXT > > (may > > > to > > > > > > > support more relational DB's which doesn't support JSONB) . > > > > > > > > > > > > > > JSONB has a couple of benefits, some of the noticeable ones are > > > that > > > > it > > > > > > > lets you define indexes on them [1], which would help us in > > > > supporting > > > > > > some > > > > > > > different query patterns as we start supporting new entities. > > Until > > > > the > > > > > > > separate table per entity discussion is finalized, this could be > > > > > helpful > > > > > > as > > > > > > > with supporting the new query pattern. For ex policy need to > > > > presently > > > > > do > > > > > > > client side filtering as policyType which is a present inside > > > > > properties, > > > > > > > it needs to be parsed in client side and then do filtering at > > > client > > > > > end. > > > > > > > > > > > > > > Please let me know your thoughts considering the above, happy to > > > put > > > > > out > > > > > > a > > > > > > > PR for the same. > > > > > > > I am happy to handle this in my JDBC impl as this is anyway a > > user > > > > > would > > > > > > be > > > > > > > consciously choosing it over eclipse impl. I would love to know > > > your > > > > > > > thoughts if we need to handle it in eclipse link impl as well. > > > > > > > > > > > > > > > > > > > > > Best, > > > > > > > Prashant > > > > > > > > > > > > > > [1] > > > > > > > > > > > > > > > > > > > > > > > > > > > https://scalegrid.io/blog/using-jsonb-in-postgresql-how-to-effectively-store-index-json-data-in-postgresql/ > > > > > > > > > > > > > -- > > > > > > Robert Stupp > > > > > > @snazy > > > > > > > > > > > > > > > > > > > > > > > > > >