Hi All,
I have been experimenting with running Polaris on PostgreSQL where multiple servers access the same database. I present two specific test cases in the appendixes below for review. The test cases are possible in real life even though they may look a bit contrived in the current state of the code. The errors that come out of running those test requests are valid in themselves, however, I’d like this discussion to avoid diving right into fixing the specific errors, but concentrate on the conceptual correctness aspects of Polaris interactions with the database. My personal interpretation of the test outcomes (having current Polaris code in mind) is that interactions with the database are not well-defined in terms of expected commit behaviours and transaction isolation levels. Note: the test case using Serializable isolation is largely inspired by code comments in TransactionWorkspaceMetaStoreManager. Given interest in NoSQL backends expressed in previous discussions, I’d like this discussion to focus on how Polaris persistence-related interfaces can be defined so that clear expectations are stated on database behaviours so as to enable reasoning about correctness of alternative persistence implementations on a conceptual level (i.e. without relying on tests to prove correctness (while using TDD to validate code, of course)). Please share your thoughts on this matter. >From my personal point of view, I’d like to refactor what TransactionWorkspaceMetaStoreManager does into a first class Polaris service object (or interface) as opposed to being an ad-hoc runtime hook for some special cases. The end goal being a REST services refactoring where changes are accumulated in-memory and then sent to the database-specific persistence implementation to be committed as a whole. The expectations on the database would be that the reads and writes made in the context of a “session” when it is committed must be non-conflicting with concurrent reads and writes made by other sessions. The important distinction with respect to EclipseLink (or any other JPA implementation) is that reads and writes in the proposed approach are reads and writes made by Polaris Services, not by the JPA middleware. This way consistency concerns are made explicit in Polaris code. The “session” model should also be flexible enough to allow different implementations to define session boundaries and resolve read/write conflicts according to the backend capabilities for a wide range of database systems. Thanks, Dmitri. ======== Appendix A: Unrelated Changes with Serializable Tx Isolation ======= Test code (old, raw, but functional): https://github.com/dimas-b/polaris/commit/88960348906c33452cc4a81cf39690272bbb191c Test case: - Run PostgreSQL with the “default-realm” database configuration with “serializable” Tx isolation mode by default - alter database "default-realm" set default_transaction_isolation = serializable; - Bootstrap Polaris (with EclipseLink) - Create Polaris catalog (e.g. “polaris2”) with FILE storage - ./polaris --client-id *** --client-secret *** catalogs create polaris2 --storage-type FILE --default-base-location file:///tmp/pol2 - Create test tables in two separate namespaces - java -jar test-client/build/libs/test-client-1.0.0-incubating-SNAPSHOT.jar http://localhost:8181/api/catalog 5 <NAMESPACE> create - Run two Polaris servers connected to the same PostgreSQL database, but - Run two test clients, each connected to its own Polaris server and using its own namespace - java -jar test-client/build/libs/test-client-1.0.0-incubating-SNAPSHOT.jar http://localhost:8181/api/catalog 5 ns4 update w_0_ - java -jar test-client/build/libs/test-client-1.0.0-incubating-SNAPSHOT.jar http://localhost:8191/api/catalog 5 ns5 update w_1_ The clients modify 5 tables together in the same Iceberg commit, but in different (!) namespaces. After a (short) while, one of the clients will get this error: Caused by: org.apache.iceberg.exceptions.ServiceFailureException: Service failed: 500: Exception [EclipseLink-4002] (Eclipse Persistence Services - 4.0.4.v202407190748-059428cdd2583c46f1f3e50d235854840a6fa9a7): org.eclipse.persistence.exceptions.DatabaseException Internal Exception: org.postgresql.util.PSQLException: ERROR: could not serialize access due to read/write dependencies among transactions Detail: Reason code: Canceled on identification as a pivot, during conflict out checking. Hint: The transaction might succeed if retried. Error Code: 0 Call: SELECT CATALOGID, ID, PARENTID, TYPECODE, NAME, SUBTYPECODE FROM ENTITIES_ACTIVE WHERE ((((CATALOGID = ?) AND (PARENTID = ?)) AND (TYPECODE = ?)) AND (NAME = ?)) bind => [4 parameters bound] Query: ReadAllQuery(referenceClass=ModelEntityActive sql="SELECT CATALOGID, ID, PARENTID, TYPECODE, NAME, SUBTYPECODE FROM ENTITIES_ACTIVE WHERE ((((CATALOGID = ?) AND (PARENTID = ?)) AND (TYPECODE = ?)) AND (NAME = ?))") ... 11 more (I can provide the full exception stack trace if somebody is interested). ======== Appendix B: Concurrent Changes with Read Committed Isolation Level ======= Similar setup as above, but with client side retries and “read committed” isolation level (default in PostgreSQL). Client code: https://github.com/dimas-b/polaris/commit/070878fd2f6bb2731780649e0bc980865bf57317 Run both clients on the same namespace, but modifying different schema columns. - java -jar test-client/build/libs/test-client-1.0.0-incubating-SNAPSHOT.jar http://localhost:8181/api/catalog 5 ns1 update w_0_ - java -jar test-client/build/libs/test-client-1.0.0-incubating-SNAPSHOT.jar http://localhost:8191/api/catalog 5 ns1 update w_1_ This failure is a bit trickier to reproduce and may require a few client-side restarts, but eventually one of the clients will get this error: Caused by: org.apache.iceberg.exceptions.ServiceFailureException: Service failed: 500: jakarta.persistence.OptimisticLockException: Exception [EclipseLink-5006] (Eclipse Persistence Services - 4.0.4.v202407190748-059428cdd2583c46f1f3e50d235854840a6fa9a7): org.eclipse.persistence.exceptions.OptimisticLockException Exception Description: The object [org.apache.polaris.core.persistence.models.ModelEntity@62e2ea2b] cannot be updated because it has changed or been deleted since it was last read. Class> org.apache.polaris.core.persistence.models.ModelEntity Primary Key> [[51, 54]: 2596] ... 11 more