pingtimeout opened a new issue, #1076:
URL: https://github.com/apache/polaris/issues/1076

   ### Describe the bug
   
   The persistence layer allows the creation of multiple tables with the same 
name within the same namespace, which should not be possible.  Additionally, it 
seems to be losing some writes.
   
   ### To Reproduce
   
   * Check out this commit: 
https://github.com/apache/polaris/commit/4cf08d40ca2eff2bfd59701ee4c7c7184f97b315
   * Run the server using the getting-started docker compose file: 
`docker compose -f getting-started/eclipselink/docker-compose.yml up`
   * Export the client ID and secrets as environment variables: `export 
CLIENT_ID=root CLIENT_SECRET=s3cr3t`
   * Run `./gradlew :polaris-benchmarks:gatlingRun`
   
   The simulation creates a catalog named `C_0` and a namespace named `NS_0`, 
then sends 50 simultaneous table creation requests for a table named 
`T_1000`.
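
The load itself comes from the Gatling simulation. As a rough illustration of what "simultaneous" means here, the Python sketch below releases 50 workers at the same instant using a barrier. `fire_simultaneously` and the `send` callback are hypothetical names, not part of Polaris or its benchmarks. In a real run `send` would wrap the HTTP client that issues the `POST`; here it is a stub that always returns 200.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def fire_simultaneously(send, count=50):
    """Call send(i) from `count` threads that are all released together."""
    barrier = threading.Barrier(count)

    def worker(i):
        barrier.wait()  # block until every worker has reached this point
        return send(i)

    # One thread per request, so no worker waits in the queue behind another.
    with ThreadPoolExecutor(max_workers=count) as pool:
        return list(pool.map(worker, range(count)))


# Stub callback (assumption): stands in for the real table creation request.
statuses = fire_simultaneously(lambda i: 200)
print(len(statuses))  # 50
```

The barrier matters because a plain loop would spread the requests out over time and make the race much harder to hit.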
   
   ### Actual Behavior
   
   The Gatling output shows that all 50 table creation requests returned with 
an `HTTP 200 OK` code.
   
   ```
   =====================================================================================
   2025-02-27 09:47:34 UTC                                                    3s elapsed
   ---- Requests ------------------------------------|---Total---|-----OK----|----KO----
   > Global                                          |       103 |       103 |         0
   > Authenticate                                    |        51 |        51 |         0
   > Create Catalog                                  |         1 |         1 |         0
   > Create Namespace                                |         1 |         1 |         0
   > Create Table                                    |        50 |        50 |         0
   ```
   
   A `curl` command that lists the tables under namespace `NS_0` shows that 
there are 23 tables with that name.
   
   ```
   $ curl \
       -s \
       "http://localhost:8181/api/catalog/v1/C_0/namespaces/NS_0/tables" \
       -H "Content-Type: application/json" \
       -H "Authorization: Bearer $TOKEN" \
     | jq '.identifiers[].name' \
     | wc -l
   23
   ```
   
   [This file](https://github.com/user-attachments/files/19006679/tables.json) 
is the complete output from `GET /api/catalog/v1/C_0/namespaces/NS_0/tables`
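
To check a response like this for duplicates, a small Python sketch can count how often each identifier appears. It assumes the Iceberg REST list-tables response shape (`identifiers`, each with `namespace` and `name`). The `duplicate_table_names` helper and the inline sample payload are illustrative only and are not the attached file.

```python
import json
from collections import Counter


def duplicate_table_names(list_tables_response: str) -> dict:
    """Return {qualified name: count} for identifiers listed more than once."""
    payload = json.loads(list_tables_response)
    counts = Counter(
        ".".join(ident["namespace"] + [ident["name"]])
        for ident in payload.get("identifiers", [])
    )
    return {name: n for name, n in counts.items() if n > 1}


# Illustrative payload (assumption): T_1001 is made up for contrast.
sample = json.dumps({
    "identifiers": [
        {"namespace": ["NS_0"], "name": "T_1000"},
        {"namespace": ["NS_0"], "name": "T_1000"},
        {"namespace": ["NS_0"], "name": "T_1001"},
    ]
})
print(duplicate_table_names(sample))  # {'NS_0.T_1000': 2}
```

A healthy catalog should produce an empty dictionary.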
   
   ### Expected Behavior
   
   Only a single `POST /api/catalog/v1/C_0/namespaces/NS_0/tables` should 
succeed.  The remaining 49 requests should be rejected with an `HTTP 409` error 
code.
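
The expected outcome can be modelled with a toy in-memory namespace in which the existence check and the insert happen atomically. This is only a sketch of the desired semantics under that assumption. `ToyNamespace` is a hypothetical class and does not reflect how the Polaris persistence layer is implemented.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class ToyNamespace:
    """In-memory stand-in for a namespace; not Polaris code."""

    def __init__(self):
        self._lock = threading.Lock()
        self._tables = set()

    def create_table(self, name: str) -> int:
        # The check and the insert share one lock, so they act as one step.
        with self._lock:
            if name in self._tables:
                return 409  # conflict: the table already exists
            self._tables.add(name)
            return 200

    def list_tables(self):
        with self._lock:
            return sorted(self._tables)


ns = ToyNamespace()
with ThreadPoolExecutor(max_workers=50) as pool:
    statuses = list(pool.map(lambda _: ns.create_table("T_1000"), range(50)))

print(statuses.count(200), statuses.count(409), ns.list_tables())
# 1 49 ['T_1000']
```

In a database-backed store the same guarantee would typically come from a uniqueness constraint or a transaction rather than an in-process lock.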
   
   Additionally, all 50 `POST /api/catalog/v1/C_0/namespaces/NS_0/tables` 
requests succeeded but only 23 tables were actually created, which means the 
server lost the other 27 acknowledged writes.  In this particular case that is 
not a big deal, since the situation is invalid to begin with.  But it raises 
the question of whether other writes can be lost under high concurrency.
   
   ### Additional context
   
   The issue is reproducible fairly consistently.  To make it even easier to 
reproduce, increase log verbosity (e.g. `-Dquarkus.log.level=DEBUG 
-Dquarkus.log.category.\"org.apache.polaris\".level=DEBUG 
-Dquarkus.log.category.\"org.apache.iceberg.rest\".level=DEBUG 
-Dquarkus.log.category.\"io.smallrye.config\".level=DEBUG`).
   
   ### System information
   
   _No response_

