Using JSONB in the PostgreSQL-specific implementation sounds reasonable to
me.

My impression from the in-progress PRs on this subject is that the effort
is focused
on PostgreSQL and reusing the SQL schema in different RDBMS systems is not
actually expected (or planned).

On the topic of compatibility with EclipseLink, I do believe that a
no-downtime
upgrade from PostgreSQL + EclipseLink to PostgreSQL + JDBC is nearly
impossible.

On upgrade paths with downtime, users can execute some code to migrate the
schema to the new types (if desired), or use the upcoming catalog migrator
tool.

Cheers,
Dmitri.


On Tue, Apr 1, 2025 at 11:44 AM Prashant Singh
<prashant.si...@snowflake.com.invalid> wrote:

> Thank you Dennis and Eric.
>
> I agree whenever we introduce (2) in BasePersistence we should definitely
> re:think if this is a query pattern
> that can be supported by most of the DBs (Relational / NoSQL) before
> accepting it for BasePersistence.
> Presently this specific change is not gonna introduce any new query
> pattern, so I think that should be fine.
> Given that BasePersistence does not change it's actually an implementation
> detail on how we store and query the underlying objects within
> BasePersistence impl.
>
> Regarding the zero downtime, 100% backward compatibility goal, I think we
> are on the same page that it was our best effort.
>
> IMHO JSON B is the right thing to do as the underlying object is actually a
> JSON, for even debugging purposes, or telemetry, presently a user debugging
> some field in their prop will have to do a string search for the text.
>
> Considering the above please let me know what your thoughts are ?
>
> Best,
> Prashant Singh
>
> On Sat, Mar 29, 2025 at 1:16 AM Eric Maynard <eric.w.mayn...@gmail.com>
> wrote:
>
> > I'm generally a +1 on using JSONB within Postgres.
> >
> > However I am in agreement with Dennis that we should avoid his item (2)
> > above. If the application will need to load entities by some attributes
> A,
> > B, and C then we should create methods loadEntityByA(a: A),
> > loadEntityByB(b: B), and loadEntityByC(c: C) rather than writing some
> > catch-all loadEntityByPredicate(?: ?). In this way, it will be clearly
> > defined what the access patterns to the metastore are expected to be and
> > what the contract metastore implementations are expected to fulfill is.
> >
> > Whether the postgres metastore decides to implement loadEntityByC using a
> > column C or by using a query against a JSON blob, I suppose that we can
> > make that evaluation on a case-by-case basis. I think at least having a
> > JSONB column seems reasonable.
> >
> > On Fri, Mar 21, 2025 at 3:40 PM Dennis Huo <huoi...@gmail.com> wrote:
> >
> > > Re: "non-transactional", I guess it's more descriptive to refer to this
> > new
> > > implementation as "AtomicOperationJdbcPersistence" if it's confusing to
> > say
> > > "non-transactional"; the key is simply that we don't expose any
> > > "runInTransaction" to the AtomicOperationMetaStoreManager layer.
> > >
> > > Regarding Robert's question about "same schema", the important
> > distinction
> > > is that all the prior discussions about "ease of migration" are more
> > > focused on in-place *unidirectional schema compatibility*, not
> > necessarily
> > > *exact same schema*.
> > >
> > > This distinction is important because it's true that the current
> > > EclipseLink implementation's three tightly-coupled tables ModelEntity,
> > > ModelEntityActive, and ModelEntityChangeTracking indeed are not all
> > carried
> > > over to the "AtomicOperationJdbcPersistence" implementation, and that
> > > keeping those three tables means keeping inefficient transaction
> > behaviors.
> > >
> > > However, keeping "ModelEntity" but adding a secondary index on it
> allows
> > > unidirectional compatibility (migrating from old EclipseLink -> new
> > > AtomicOperationJdbcPersistence), without still keeping those three
> > > tightly-coupled tables.
> > >
> > > I suppose it's also true though that backwards-compatibility right now
> is
> > > kind of "best effort" at this stage in the project; if we do discover
> > > unforeseen serious shortcomings in trying to keep the "ModelEntity"
> table
> > > the same, it's reasonable to consider giving existing users a different
> > > migration path and defining a totally different schema if needed. But
> by
> > > default it's "nice" if it happens to be easy to preserve that
> > compatibility
> > >
> > > Regarding JSONB, on the one hand, self-contained optimizations within a
> > > given persistence implementation (e.g. silently re-serialize any
> > retrieved
> > > JSONB-backed JsonObjects into a plain JSON String to keep the current
> > > PolarisBaseEntity interface unchanged) seem safe enough but we wouldn't
> > > really reap much benefit (other than maybe a slight generalized
> > performance
> > > improvement) without either:
> > >
> > > 1. Retooling the db-agnostic layer so that we efficiently interact
> with a
> > > JSONB-backed JsonObject directly instead of how the code currently
> > expects
> > > a standard JSON string and redundantly deserializes it a bunch.
> > > 2. Actually using query patterns that index on fields within the JSON
> > >
> > > However, it's unclear how we expect to do (2) if we also have to
> support
> > > the other persistence backend implementations. Would we just optimize
> the
> > > index on PolicyType just for Postgres but not for the other
> > > implementations?
> > >
> > > If we wanted to make indexing of other subfields first-class, we
> probably
> > > need a better way of conveying that intent.
> > >
> > > Overall my take is it's fine enough if we think JSONB and/or "separate
> > > tables per type" have enough benefits to break the
> > > EclipseLink-compatibility behavior, and then using JSONB only as an
> > > internal detail is harmless, same as if a persistence implementation
> > choose
> > > to transparently store text compressed for example.
> > >
> > > If we're going to use fancy indexing features though we should make
> sure
> > we
> > > have an idea of how we intend to do that for other persistence backends
> > as
> > > well (or ideally, if it fits into a general model we only need to
> > implement
> > > once for each persistence backend and then can reuse for all sorts of
> new
> > > entity types).
> > >
> > >
> > > On Fri, Mar 21, 2025 at 2:41 PM Yufei Gu <flyrain...@gmail.com> wrote:
> > >
> > > > Thanks, Prashant, for working on this!
> > > >
> > > > Using JSONB in the WIP JDBC implementation makes sense. For the
> > > EclipseLink
> > > > implementation, I'd recommend keeping the current schema as-is
> (without
> > > > JSONB) so existing users don’t need to go through a migration.
> > > >
> > > > If users do want to migrate, I’d suggest moving directly to the JDBC
> > > > implementation rather than adapting EclipseLink to the new JSONB
> field
> > > >
> > > > Yufei
> > > >
> > > >
> > > > On Fri, Mar 21, 2025 at 9:08 AM Prashant Singh
> > > > <prashant.si...@snowflake.com.invalid> wrote:
> > > >
> > > > > Hi Robert,
> > > > >
> > > > > Thank you for your response!
> > > > >
> > > > > To clarify my point about "non-transactional" operations, I meant
> > that
> > > > most
> > > > > operations don't require full transactions, with the exception of
> > > methods
> > > > > such as writeEntities. For this specific case, Dennis created a new
> > > > > implementation of Polaris MetaStore Manager, AtomicMetaStoreManager
> > > [1],
> > > > > which:
> > > > >
> > > > >    - Implements PolarisMetaStoreManager using only one-shot atomic
> > > > >    operations within a BasePersistence implementation.
> > > > >    - Avoids requirements of open-ended multi-statement
> transactions.
> > > > >
> > > > > This approach doesn't necessitate model changes. My understanding
> is
> > > that
> > > > > we've agreed on this implementation in its current state (as
> Dimitri
> > > > > approved it), and the JDBCImpl will simply implement
> BasePersistence
> > > [2]
> > > > > and utilize AtomicMetaStoreManager to eliminate the need for
> > > > EclipseLink's
> > > > > transactional operations.
> > > > >
> > > > > Regarding your comment:
> > > > >
> > > > > because IIRC the requirements that some people voiced is to keep
> the
> > > same
> > > > > schema as the current Eclipselink one
> > > > >
> > > > > Yes, I'm aiming to maintain the same schema as EclipseLink in the
> new
> > > > JDBC
> > > > > implementation to facilitate easy migration. The entity table,
> grants
> > > > > table, and all other schemas remain consistent.
> > > > >
> > > > > Regarding the JSONB suggestion, I wanted to gather the community's
> > > > > thoughts. If we agree, we could incorporate it into both the JDBC
> and
> > > > > EclipseLink implementations, or solely into the JDBC
> implementation.
> > As
> > > > > mentioned, JSONB would be beneficial for supporting new query
> > patterns.
> > > > >
> > > > > Please share your thoughts on this.
> > > > >
> > > > > Best regards,
> > > > >
> > > > > Prashant Singh
> > > > >
> > > > > [1]
> > > > >
> > > > >
> > > >
> > >
> >
> https://github.com/apache/polaris/blob/main/polaris-core/src/main/java/org/apache/polaris/core/persistence/AtomicOperationMetaStoreManager.java
> > > > > [2]
> > > > >
> > > > >
> > > >
> > >
> >
> https://github.com/apache/polaris/blob/main/polaris-core/src/main/java/org/apache/polaris/core/persistence/BasePersistence.java
> > > > >
> > > > > On Fri, Mar 21, 2025 at 5:23 AM Robert Stupp <sn...@snazy.de>
> wrote:
> > > > >
> > > > > > Hey Prashant,
> > > > > >
> > > > > > thanks for working on this!
> > > > > >
> > > > > > I'm a bit confused about "non transactional Postgres" - how's PG
> > not
> > > > > > transactional? If so (like one DML per tx), how is consistency
> > > > > > guaranteed, because IIRC the requirements that some people voiced
> > is
> > > to
> > > > > > keep the same schema as the current Eclipselink one. I've been a
> > bit
> > > > > > confused whether people use that one or not, because different
> > people
> > > > > > tell different things. Can you shed some light on that? I mean,
> if
> > EL
> > > > > > isn't used and there are better ways, why should we keep EL at
> all?
> > > > > >
> > > > > >  From my experience (via #1189), it's not possible to just issue
> > one
> > > > DML
> > > > > > per PG tx and guarantee consistency with the data model defined
> by
> > > EL.
> > > > > > So I suspect you're working on a different database schema. If
> so,
> > > how
> > > > > > does it work?
> > > > > >
> > > > > > Can you share your design and work with everybody here?
> > > > > >
> > > > > > Robert
> > > > > >
> > > > > > [1] https://github.com/apache/polaris/pull/1189
> > > > > >
> > > > > > On 20.03.25 19:21, Prashant Singh wrote:
> > > > > > > Hey folks,
> > > > > > > I am presently working on implementing non transactional
> postgres
> > > > > > > implementation using jdbc. in the course of that i noticed in
> the
> > > > > entity
> > > > > > > table that the column type of the properties / internal
> > properties
> > > > > column
> > > > > > > is TEXT and not JSONB, I dug a bit into the history of this,
> > seems
> > > > like
> > > > > > we
> > > > > > > did wanted to have JSONB but ended up implementing it as TEXT
> > (may
> > > to
> > > > > > > support more relational DB's which doesn't support JSONB) .
> > > > > > >
> > > > > > > JSONB has a couple of benefits, some of the noticeable ones are
> > > that
> > > > it
> > > > > > > lets you define indexes on them [1], which would help us in
> > > > supporting
> > > > > > some
> > > > > > > different query patterns as we start supporting new entities.
> > Until
> > > > the
> > > > > > > separate table per entity discussion is finalized, this could
> be
> > > > > helpful
> > > > > > as
> > > > > > > with supporting the new query pattern. For ex policy need to
> > > > presently
> > > > > do
> > > > > > > client side filtering as policyType which is a present inside
> > > > > properties,
> > > > > > > it needs to be parsed in client side and then do filtering at
> > > client
> > > > > end.
> > > > > > >
> > > > > > > Please let me know your thoughts considering the above, happy
> to
> > > put
> > > > > out
> > > > > > a
> > > > > > > PR for the same.
> > > > > > > I am happy to handle this in my JDBC impl as this is anyway a
> > user
> > > > > would
> > > > > > be
> > > > > > > consciously choosing it over eclipse impl. I would love to know
> > > your
> > > > > > > thoughts if we need to handle it in eclipse link impl as well.
> > > > > > >
> > > > > > >
> > > > > > > Best,
> > > > > > > Prashant
> > > > > > >
> > > > > > > [1]
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> https://scalegrid.io/blog/using-jsonb-in-postgresql-how-to-effectively-store-index-json-data-in-postgresql/
> > > > > > >
> > > > > > --
> > > > > > Robert Stupp
> > > > > > @snazy
> > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>

Reply via email to