dennishuo commented on issue #1076:
URL: https://github.com/apache/polaris/issues/1076#issuecomment-2692686409

   Thanks for the report, and the handy benchmark tools! Did some digging, 
looks like there are actually three distinct bugs in play:
   
   1. Failure to configure the db layer with the necessary isolation level 
(SERIALIZABLE) - in normal prod scenarios this would be configured differently 
depending on the backend, e.g. if using a "managed Postgres DB", but certainly 
we should have defense-in-depth - (a) Set the right default in a handy postgres 
conf file that we add to our getting-started, and (b) Make our EclipseLink 
initialization automatically set SERIALIZABLE transaction isolation as well
   2. *Apparently* EclipseLink uses a different read connection vs write 
connection even in the same `EntityManager` session and normally even if 
there's a transaction. This means the two connections aren't part of the same 
transaction, and the reads see a different/dirty view of the database compared 
to the decisions leading to the writes. 
https://docs.oracle.com/middleware/1221/toplink/concepts/data_access.htm#CHDJGAJE
 
https://eclipse.dev/eclipselink/documentation/2.4/jpa/extensions/p_transaction.htm
   3. There's an unfortunate TODO that fell through the cracks in 
`BasePolarisCatalog` in `createTableLike` and `updateTableLike` -- 
historically, it seems the DB layer vacillated between returning a status enum 
vs throwing an exception, so code at one time relied on exceptions escaping up 
to cause error return codes. But now with return status enums, the enums need 
to be translated back into exceptions.
   
   It turns out (1b) requires a bit of extra injection to make it happen: 
https://eclipse.dev/eclipselink/documentation/2.6/dbws/creating_dbws_services002.htm
 (I also experimented with trying to unpack the database session at runtime in 
`PolarisEclipseLinkMetaStoreSessionImpl` itself and did also get it to work but 
it's hacky/messy)
   
   Number (2) just requires adding to `persistence.xml`:
   
         <property name="eclipselink.transaction.join-existing" value="true"/>
   
   (1) and (2) together are both needed to avoid corruption of the backend 
database. If these two are fixed, then at least the data model and underlying 
behavior is correct, but the only problem is erroneous `200` return codes.
   
   (3) Requires handling the returned `EntityResult` in `createTableLike` and 
`updateTableLike` in `BasePolarisCatalog`. Fixing this also fixes the erroneous 
`200` return codes.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

Reply via email to