eric-maynard opened a new pull request, #1686: URL: https://github.com/apache/polaris/pull/1686
The location overlap check for "sibling" tables (those that share a parent) has been a performance bottleneck since its introduction, and until now the only workaround has been to disable the check.

<hr>

### Current Behavior

When a table is created, the current logic lists all of its sibling tables and checks each one for location overlap. Adding N tables to a namespace therefore takes O(N^2) checks in total, which quickly becomes untenable.

With the `CreateTreeDataset` [benchmark](https://github.com/eric-maynard/polaris-tools/blob/main/benchmarks/src/gatling/scala/org/apache/polaris/benchmarks/simulations/CreateTreeDataset.scala) I tested creating 5000 sibling tables using the current code:

<img width="700" alt="Screenshot 2025-05-27 at 4 26 56 PM" src="https://github.com/user-attachments/assets/f6fcc214-3ff8-49b8-b0eb-4bed7360d41a" />

Latency clearly increases over time. Runs took between 90 and 200+ seconds, and Polaris instances with a small memory allocation were prone to crashing due to OOMs:

<img width="500" alt="Screenshot 2025-05-27 at 4 33 57 PM" src="https://github.com/user-attachments/assets/71d8224e-eaf8-4d0b-9cd5-51e00204dc97" />

### Proposed change

This PR adds a new persistence API, `hasOverlappingSiblings`, which, if implemented, can be used to check for overlapping siblings directly at the metastore layer. The API is implemented for the JDBC metastore in a new schema version, and some changes are made to account for an evolving schema version now and in the future.

The implementation breaks a location down into its components and queries for a sibling at each of the resulting locations. For example, a new table at location `s3://bucket/root/n1/nA/t1/` requires checking for an entity with location `s3://bucket/`, `s3://bucket/root/`, `s3://bucket/root/n1/`, `s3://bucket/root/n1/nA/`, and finally `s3://bucket/root/n1/nA/t1/%`.
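To illustrate the decomposition described above, here is a minimal Python sketch. It is not the PR's implementation: the function name `location_prefixes` and its return shape are hypothetical, and it assumes locations always have the form `scheme://component/component/...`.

```python
def location_prefixes(location: str) -> tuple[list[str], str]:
    """Split a location into exact-match ancestor prefixes and one LIKE pattern.

    Hypothetical helper for illustration only; the name and return shape
    are assumptions, not part of the Polaris API.
    """
    scheme, sep, rest = location.partition("://")
    if not sep:
        raise ValueError(f"expected a scheme://path location, got {location!r}")

    # Ignore empty components so a trailing slash does not matter.
    parts = [p for p in rest.split("/") if p]
    if not parts:
        raise ValueError(f"location has no path components: {location!r}")

    prefixes = []
    current = scheme + "://"
    for part in parts:
        current += part + "/"
        prefixes.append(current)

    # The last prefix is the new location itself. Anything at or below it
    # overlaps, so it becomes a trailing-wildcard pattern. A real query would
    # also need to escape '%' and '_' inside the location before appending '%'.
    own_location = prefixes.pop()
    return prefixes, own_location + "%"


ancestors, pattern = location_prefixes("s3://bucket/root/n1/nA/t1/")
for prefix in ancestors:
    print("exact match:", prefix)
print("pattern match:", pattern)
```

For the example location this yields the four exact-match prefixes and the single `%` pattern listed in the description.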
All of this can be done in a single query that makes a single pass over the data. The query is optimized by the introduction of a new index over a new _location_ column.

With the changes enabled, I tested creating 5000 sibling tables:

<img width="700" alt="Screenshot 2025-05-27 at 4 32 12 PM" src="https://github.com/user-attachments/assets/1e9ffd59-ed7d-4923-831e-6e35e2028fe2" />

Latency is stable over time, and runs consistently completed in less than 30 seconds. I did not observe any OOMs when testing with the feature enabled.
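As a rough illustration of how a single indexed query over a location column could answer the overlap question, here is a self-contained Python sketch using an in-memory SQLite database. Everything about the schema is an assumption made for the example: the `entities` table, its `parent_id` and `location` columns, the index name, and the `has_overlapping_sibling` function are hypothetical and do not reflect the actual Polaris JDBC schema or the SQL this PR generates.

```python
import sqlite3


def escape_like(value: str) -> str:
    """Escape LIKE wildcards so characters such as '_' match literally."""
    return value.replace("\\", "\\\\").replace("%", "\\%").replace("_", "\\_")


def has_overlapping_sibling(conn, parent_id, ancestors, location):
    """Return True if any entity under parent_id overlaps `location`.

    `ancestors` are the exact-match prefixes of `location`; `location`
    itself is matched with a trailing wildcard. Hypothetical sketch only.
    """
    conditions = ["location LIKE ? ESCAPE '\\'"]
    params = [parent_id, escape_like(location) + "%"]
    if ancestors:
        marks = ", ".join("?" for _ in ancestors)
        conditions.append(f"location IN ({marks})")
        params.extend(ancestors)
    sql = (
        "SELECT 1 FROM entities WHERE parent_id = ? AND ("
        + " OR ".join(conditions)
        + ") LIMIT 1"
    )
    return conn.execute(sql, params).fetchone() is not None


conn = sqlite3.connect(":memory:")
# SQLite's LIKE ignores ASCII case by default; ask for case-sensitive matching.
conn.execute("PRAGMA case_sensitive_like = ON")
conn.execute(
    "CREATE TABLE entities ("
    "id INTEGER PRIMARY KEY, parent_id INTEGER NOT NULL, location TEXT NOT NULL)"
)
# Assumed index shape: siblings share a parent, so index parent then location.
conn.execute(
    "CREATE INDEX idx_entities_parent_location ON entities (parent_id, location)"
)
conn.executemany(
    "INSERT INTO entities (parent_id, location) VALUES (?, ?)",
    [(1, "s3://bucket/root/n1/nA/t1/"), (1, "s3://bucket/root/n1/nA/t2/")],
)

# Exact-match prefixes for any table directly under s3://bucket/root/n1/nA/.
ANCESTORS = [
    "s3://bucket/",
    "s3://bucket/root/",
    "s3://bucket/root/n1/",
    "s3://bucket/root/n1/nA/",
]

print(has_overlapping_sibling(conn, 1, ANCESTORS, "s3://bucket/root/n1/nA/t3/"))
print(has_overlapping_sibling(conn, 1, ANCESTORS, "s3://bucket/root/n1/nA/t1/"))
```

The point of the sketch is that the cost of each check depends on the depth of the location rather than on the number of existing siblings, which is consistent with the flat latency reported above.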