pingtimeout opened a new issue, #1123:
URL: https://github.com/apache/polaris/issues/1123

   ### Describe the bug
   
   Under a low-concurrency test, entity updates are rejected with an HTTP 500 error. This applies to catalog updates, namespace creation, table updates, and so on.
   
   ### To Reproduce
   
   * Check out this commit: 
https://github.com/apache/polaris/commit/fbd3b90e431c3b8ecd5077cb0a563cab27670c57
   * Run the server using the getting-started docker compose file: `docker compose -f getting-started/eclipselink/docker-compose.yml up`
   * Export the client ID and secret as environment variables: `export CLIENT_ID=root CLIENT_SECRET=s3cr3t`
   * Run `./gradlew :polaris-benchmarks:gatlingRun`
   
   The simulation will run 5 concurrent users.  Each user creates its own 
catalog (named `C_0`, `C_1`, ...).  Then, each user sequentially creates 5 
namespaces under its own catalog (named `NS_0`, `NS_1`, ...).
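
The request volume this scenario should produce can be tallied with a short sketch. The Python below only illustrates the workload described above; it is not the Gatling simulation itself. The function name `planned_requests` is hypothetical, and the sketch assumes one authentication request per user.

```python
from collections import Counter


def planned_requests(users: int = 5, namespaces_per_user: int = 5) -> list[tuple[str, str]]:
    """Return (request kind, target) pairs for the described scenario."""
    plan = []
    for u in range(users):
        catalog = f"C_{u}"
        plan.append(("Authenticate", f"user {u}"))
        plan.append(("Create Catalog", catalog))
        # Each user creates its namespaces sequentially under its own catalog.
        for n in range(namespaces_per_user):
            plan.append(("Create Namespace", f"{catalog}/NS_{n}"))
    return plan


plan = planned_requests()
totals = Counter(kind for kind, _ in plan)
print(dict(totals), "total:", len(plan))
```

With the defaults this gives 5 authentications, 5 catalog creations, and 25 namespace creations, 35 requests in total, and no two users ever target the same catalog or namespace.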
   
   ### Actual Behavior
   
   The Gatling output consistently shows that some catalogs and namespaces could not be created. In the output below, only 1 catalog was created and the other 4 creations were rejected with an HTTP 500 error.
   
   ```
   ================================================================================
   2025-03-05 16:42:50 UTC                                               0s elapsed
   ---- Requests -------------------------------|---Total---|-----OK----|----KO----
   > Global                                     |        35 |        11 |        24
   > Authenticate                               |         5 |         5 |         0
   > Create Catalog                             |         5 |         1 |         4
   > Create Namespace                           |        25 |         5 |        20
   ---- Errors --------------------------------------------------------------------
   > status.find.is(200), but actually found 404                        20 (83.33%)
   > status.find.is(201), but actually found 500                         4 (16.67%)
   ```
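
The error breakdown is consistent with the request counts. One plausible reading, which is an assumption and not something the report or the log excerpt confirms, is that the 404 responses are namespace creations aimed at catalogs whose creation had already failed: 4 failed catalogs times 5 namespaces each gives 20. The sketch below just redoes that arithmetic; the variable names are illustrative.

```python
failed_catalogs = 4          # "Create Catalog" KO count from the report
namespaces_per_catalog = 5   # each user creates 5 namespaces

# Assumption: every namespace request under a missing catalog returns 404.
expected_404 = failed_catalogs * namespaces_per_catalog
total_ko = failed_catalogs + expected_404

# Share of each error among all failed requests, in percent.
share_404 = round(100 * expected_404 / total_ko, 2)
share_500 = round(100 * failed_catalogs / total_ko, 2)
print(expected_404, total_ko, share_404, share_500)
```

Under that assumption the numbers line up with the report: 20 responses with 404 and 4 with 500, out of 24 failed requests.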
   
   [This file](https://github.com/user-attachments/files/19093245/polaris.log) is the server log for the Polaris instance. It contains numerous errors like the one below:
   
   ```
   2025-03-05 16:35:20 INFO  [org.apache.polaris.service.exception.IcebergExceptionMapper] (executor-thread-1) Handling runtimeException Exception [EclipseLink-4002] (Eclipse Persistence Services - 4.0.5.v202412231137-a96b873527f305f932543045c8679bb1de8d3a43): org.eclipse.persistence.exceptions.DatabaseException
   Internal Exception: org.postgresql.util.PSQLException: ERROR: could not serialize access due to read/write dependencies among transactions
     Detail: Reason code: Canceled on identification as a pivot, during conflict out checking.
     Hint: The transaction might succeed if retried.
   Error Code: 0
   Call: UPDATE ENTITIES SET GRANTRECORDSVERSION = ?, VERSION = ? WHERE (((CATALOGID = ?) AND (ID = ?)) AND (VERSION = ?))
           bind => [5 parameters bound]
   Query: UpdateObjectQuery(org.apache.polaris.jpa.models.ModelEntity@2da9cea8)
   ```
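
The hint in the log suggests retrying. PostgreSQL reports serialization failures with SQLSTATE `40001`, and a common way to handle them is to rerun the whole transaction a bounded number of times. The sketch below illustrates that pattern in Python for brevity. It is not Polaris code, and `SerializationFailure`, `run_with_retry`, and the backoff values are hypothetical names and numbers chosen for the example.

```python
import random
import time

SERIALIZATION_FAILURE = "40001"  # PostgreSQL SQLSTATE for serialization_failure


class SerializationFailure(Exception):
    """Hypothetical stand-in for a driver error carrying a SQLSTATE."""
    sqlstate = SERIALIZATION_FAILURE


def run_with_retry(transaction, max_attempts=5, base_delay=0.01):
    """Run `transaction` (a callable doing one whole transaction), retrying on 40001."""
    for attempt in range(1, max_attempts + 1):
        try:
            return transaction()
        except SerializationFailure:
            if attempt == max_attempts:
                raise  # give up and surface the error to the caller
            # Exponential backoff with jitter so competing writers spread out.
            time.sleep(base_delay * (2 ** (attempt - 1)) * random.random())


# Usage: a fake transaction that fails twice, then succeeds.
calls = {"n": 0}


def flaky_update():
    calls["n"] += 1
    if calls["n"] < 3:
        raise SerializationFailure("could not serialize access")
    return "committed"


result = run_with_retry(flaky_update)
```

The retry wraps the entire transaction, reads included, because a transaction that was cancelled for a serialization failure has to be started again from the beginning rather than resumed.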
   
   Those errors are not caught, which results in an HTTP 500 response being sent to the client. Here is the payload received on the Gatling side:
   
   ```
   {"error":{"message":"Exception [EclipseLink-4002] (Eclipse Persistence Services - 4.0.5.v202412231137-a96b873527f305f932543045c8679bb1de8d3a43): org.eclipse.persistence.exceptions.DatabaseException\nInternal Exception: org.postgresql.util.PSQLException: ERROR: could not serialize access due to concurrent update\nError Code: 0\nCall: UPDATE ENTITIES SET GRANTRECORDSVERSION = ?, VERSION = ? WHERE (((CATALOGID = ?) AND (ID = ?)) AND (VERSION = ?))\n\tbind => [5 parameters bound]\nQuery: UpdateObjectQuery(org.apache.polaris.jpa.models.ModelEntity@1123186d)","type":"PersistenceException","code":500}}
   ```
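
A client that wants to tell these transient failures apart from other 500 responses could inspect the payload. The sketch below parses a shortened copy of the payload above and checks for PostgreSQL's "could not serialize access" text. Matching on the message is a fragile heuristic, and the helper name `is_serialization_failure` is made up for this example.

```python
import json

# Shortened copy of the payload above; the middle of the message is elided here.
payload = json.dumps({
    "error": {
        "message": "Exception [EclipseLink-4002] ... ERROR: could not serialize "
                   "access due to concurrent update\nError Code: 0",
        "type": "PersistenceException",
        "code": 500,
    }
})


def is_serialization_failure(body: str) -> bool:
    """Heuristic: a 500 PersistenceException whose message mentions a serialization error."""
    error = json.loads(body).get("error", {})
    return (
        error.get("code") == 500
        and error.get("type") == "PersistenceException"
        and "could not serialize access" in error.get("message", "")
    )


print(is_serialization_failure(payload))
```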
   
   ### Expected Behavior
   
   Given that there is no overlap between the users' catalogs and namespaces, all requests should succeed.
   
   ### Additional context
   
   This result was reproduced even after https://github.com/apache/polaris/pull/1092 was merged into `main`.
   
   ### System information
   
   _No response_

