dennishuo commented on issue #1076: URL: https://github.com/apache/polaris/issues/1076#issuecomment-2692686409
Thanks for the report, and the handy benchmark tools! Did some digging, looks like there are actually three distinct bugs in play: 1. Failure to configure the db layer with the necessary isolation level (SERIALIZABLE) - in normal prod scenarios this would be configured differently depending on the backend, e.g. if using a "managed Postgres DB", but certainly we should have defense-in-depth - (a) Set the right default in a handy postgres conf file that we add to our getting-started, and (b) Make our EclipseLink initialization automatically set SERIALIZABLE transaction isolation as well 2. *Apparently* EclipseLink uses a different read connection vs write connection even in the same `EntityManager` session and normally even if there's a transaction. This means the two connections aren't part of the same transaction, and the reads see a different/dirty view of the database compared to the decisions leading to the writes. https://docs.oracle.com/middleware/1221/toplink/concepts/data_access.htm#CHDJGAJE https://eclipse.dev/eclipselink/documentation/2.4/jpa/extensions/p_transaction.htm 3. There's an unfortunate TODO that fell through the cracks in `BasePolarisCatalog` in `createTableLike` and `updateTableLike` -- historically, it seems the DB layer vacillated between returning a status enum vs throwing an exception, so code at one time relied on exceptions escaping up to cause error return codes. But now with return status enums, the enums need to be translated back into exceptions. It turns out (1b) requires a bit of extra injection to make it happen: https://eclipse.dev/eclipselink/documentation/2.6/dbws/creating_dbws_services002.htm (I also experimented with trying to unpack the database session at runtime in `PolarisEclipseLinkMetaStoreSessionImpl` itself and did also get it to work but it's hacky/messy) Number (2) just requires adding to `persistence.xml`: <property name="eclipselink.transaction.join-existing" value="true"/> (1) and (2) together are both needed to avoid corruption of the backend database. If these two are fixed, then at least the data model and underlying behavior is correct, but the only problem is erroneous `200` return codes. (3) Requires handling the returned `EntityResult` in `createTableLike` and `updateTableLike` in `BasePolarisCatalog`. Fixing this also fixes the erroneous `200` return codes. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
