yutannihilation opened a new issue, #617: URL: https://github.com/apache/sedona-db/issues/617
When I was trying to do the same thing as the DuckDB SQL in [Overture Maps' docs](https://docs.overturemaps.org/getting-data/duckdb/), I found SedonaDB is strangely slow compared to DuckDB.

Benchmark results (seconds):

| Engine | Median | Mean | Min | Max | Runs | row_count | max_confidence |
|---|---:|---:|---:|---:|---:|---:|---:|
| DuckDB | 0.588484 | 2.243619 | 0.585020 | 5.557354 | 3 | 7471 | 0.999544084072 |
| SedonaDB | 54.524910 | 55.320437 | 51.254931 | 60.181470 | 3 | 7471 | 0.999544084072 |

This is the script for benchmarking: https://github.com/yutannihilation/sedona-db/commit/a2c5a8fd2177586787df6e335de390a03cbeb2a4

I suspect it's because SedonaDB doesn't utilize the statistics of the bbox column for pushdown, because it's faster (but still 10x slower than DuckDB) if I replace these conditions on `bbox` with `ST_Intersects(geometry, ...)`. I know this DuckDB SQL is a temporary one because `bbox` column will disappear, but I'm wondering if this is an expected result.

```sql
WHERE categories.primary = 'pizza_restaurant'
  AND bbox.xmin BETWEEN -75 AND -73
  AND bbox.ymin BETWEEN 40 AND 41
```

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at: [email protected]
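For context on the pushdown hypothesis in the issue, here is a minimal sketch of how min/max column statistics can let an engine skip whole row groups for the `bbox.xmin` / `bbox.ymin` range conditions. It is an illustration only, not SedonaDB's or DuckDB's actual implementation: the `RowGroupStats` class, the `may_match` helper, and the statistics values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class RowGroupStats:
    """Hypothetical per-row-group min/max statistics for bbox.xmin and bbox.ymin."""
    xmin_min: float
    xmin_max: float
    ymin_min: float
    ymin_max: float


def may_match(stats, x_range, y_range):
    """Return True if the row group could contain a row satisfying
    bbox.xmin BETWEEN x_range AND bbox.ymin BETWEEN y_range."""
    x_lo, x_hi = x_range
    y_lo, y_hi = y_range
    # A range predicate can only match if it overlaps the [min, max] interval.
    x_overlaps = stats.xmin_max >= x_lo and stats.xmin_min <= x_hi
    y_overlaps = stats.ymin_max >= y_lo and stats.ymin_min <= y_hi
    return x_overlaps and y_overlaps


# Made-up statistics for three row groups (not taken from the Overture data).
row_groups = [
    RowGroupStats(-76.0, -74.5, 39.5, 40.5),  # overlaps both ranges: must be scanned
    RowGroupStats(10.0, 12.0, 45.0, 47.0),    # xmin range disjoint: can be skipped
    RowGroupStats(-74.0, -73.5, 42.0, 43.0),  # ymin range disjoint: can be skipped
]

# Same bounds as the WHERE clause in the issue.
to_scan = [i for i, rg in enumerate(row_groups)
           if may_match(rg, (-75, -73), (40, 41))]
print(to_scan)
```

If an engine evaluates the `bbox` conditions without consulting such statistics, it has to read and filter every row group, which would be consistent with the large gap in the benchmark, though the issue itself only states this as a suspicion.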
