xuzifu666 opened a new pull request, #7919: URL: https://github.com/apache/paimon/pull/7919
### Purpose

Paimon currently does not support R-tree indexes. See this paper for how the index is implemented: https://postgis.net/docs/support/rtree.pdf

The following are the relevant benchmark results:

### Hardware Configuration

- **CPU**: MacBook Pro (M-series processor)
- **Memory**: 16GB LPDDR5
- **Operating System**: macOS 14.x

### Software Configuration

- **Java Version**: OpenJDK 11+
- **Build**: Maven 3.8.x

### Test Parameters

- **Warmup Iterations**: 3
- **Benchmark Iterations**: 10
- **Query Count**: 1000-10000 queries
- **Random Seed**: 42 (for reproducibility)

**Query Performance (10,000 queries)**

```
R-Tree: 0.47 µs per query
Linear Scan: 464.41 µs per query
Speedup: 985.58×
Average results per query: 20 records
```

**Analysis by Dataset Size**

| Dataset Size | R-Tree (µs) | Linear Scan (µs) | Speedup | Query Selectivity |
| -- | -- | -- | -- | -- |
| 1K | 0.20 | 14.90 | 75× | 2% |
| 10K | 0.12 | 50.24 | 403× | 2% |
| 100K | 0.35 | 492.44 | 1407× | 2% |
| 1M | 0.39 | 495.25 | 1279× | 2% |

| Query Type | Area Size | R-Tree (µs) | Linear Scan (µs) | Speedup | Selectivity |
| -- | -- | -- | -- | -- | -- |
| Small | 500×500 | 0.22 | 366.27 | 1684× | 0.02% |
| Medium | 1500×1500 | 0.21 | 400.52 | 1899× | 0.02% |
| Large | 5000×5000 | 0.28 | 556.48 | 1997× | 0.03% |

**Point Query vs Range Query**

Search Performance on 100K Dataset:

```
Point queries (1000): 303.76 µs/query (with warmup optimization)
Range queries (100): 357.04 µs/query
Linear scan (100 scans): 65170.62 µs/scan
```

Improvement vs Linear Scan:

- Point query: 214× speedup
- Range query: 182× speedup

**Sequential Data Access Pattern**

```
1M grid data (1000×1000 points)
Average query time: 1.54 µs
Results returned: 30 records

Performance Characteristics:
- First query: 8.38 µs (cache warmup)
- Subsequent queries: 0.67-0.88 µs (steady state)
```

### Tests

    # Run comparison benchmark
    java -cp paimon-common/target/test-classes:paimon-common/target/classes \
        org.apache.paimon.fileindex.rtree.RTreeVsLinearScanBenchmark

    # Run detailed benchmark
    java -cp paimon-common/target/test-classes:paimon-common/target/classes \
        org.apache.paimon.fileindex.rtree.RTreeBenchmark
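
For reviewers who want a feel for the baseline side of the comparison, the following is a minimal sketch of a linear range scan with a warmup-then-measure timing loop. It uses the test parameters listed in the description (3 warmup iterations, 10 measured iterations, seed 42). It is not the code in this PR: the class and method names (`LinearScanBaseline`, `countInRange`, `randomPoints`, `avgMicrosPerQuery`), the 10,000×10,000 coordinate space, and the 100-query batch are assumptions made for illustration, and no R-tree is included.

```java
import java.util.Random;

/** Hypothetical sketch: linear-scan baseline with a warmup/measure timing loop. */
class LinearScanBaseline {

    /** Counts points inside the closed rectangle [minX, maxX] x [minY, maxY]. */
    static int countInRange(double[][] points, double minX, double minY,
                            double maxX, double maxY) {
        int hits = 0;
        for (double[] p : points) {
            if (p[0] >= minX && p[0] <= maxX && p[1] >= minY && p[1] <= maxY) {
                hits++;
            }
        }
        return hits;
    }

    /** Generates n uniformly random points in [0, extent) x [0, extent). */
    static double[][] randomPoints(int n, double extent, long seed) {
        Random rnd = new Random(seed);
        double[][] pts = new double[n][2];
        for (double[] p : pts) {
            p[0] = rnd.nextDouble() * extent;
            p[1] = rnd.nextDouble() * extent;
        }
        return pts;
    }

    /** Runs untimed warmup rounds, then returns mean microseconds per query. */
    static double avgMicrosPerQuery(double[][] points, double[][] queries,
                                    int warmup, int iterations) {
        long sink = 0; // accumulate results so the JIT cannot discard the scans
        for (int i = 0; i < warmup; i++) {
            for (double[] q : queries) {
                sink += countInRange(points, q[0], q[1], q[2], q[3]);
            }
        }
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            for (double[] q : queries) {
                sink += countInRange(points, q[0], q[1], q[2], q[3]);
            }
        }
        long elapsedNanos = System.nanoTime() - start;
        if (sink < 0) {
            throw new IllegalStateException("hit counts cannot be negative");
        }
        return elapsedNanos / 1_000.0 / ((double) iterations * queries.length);
    }

    public static void main(String[] args) {
        double extent = 10_000.0; // assumed coordinate space, not from the PR
        double side = 500.0;      // matches the "Small" 500x500 query area
        double[][] points = randomPoints(100_000, extent, 42L);

        Random rnd = new Random(42L);
        double[][] queries = new double[100][4]; // {minX, minY, maxX, maxY}
        for (double[] q : queries) {
            q[0] = rnd.nextDouble() * (extent - side);
            q[1] = rnd.nextDouble() * (extent - side);
            q[2] = q[0] + side;
            q[3] = q[1] + side;
        }

        double micros = avgMicrosPerQuery(points, queries, 3, 10);
        System.out.printf("Linear scan: %.2f us per query%n", micros);
    }
}
```

Timing an index lookup over the same query set with the same loop would give the second number needed for a speedup ratio. A hand-rolled `System.nanoTime()` loop like this is sensitive to JIT and GC noise, so the absolute figures it prints should be treated as rough.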
