[I] SedonaDB doesn't use the column statistics of `bbox` column? [sedona-db]

via GitHub Fri, 13 Feb 2026 23:38:45 -0800


yutannihilation opened a new issue, #617:
URL: https://github.com/apache/sedona-db/issues/617


   When I tried to run the same query as the DuckDB SQL in [Overture Maps' docs](https://docs.overturemaps.org/getting-data/duckdb/), 
I found that SedonaDB is much slower than DuckDB (roughly 90x by median in the 
benchmark below). 
   
   Benchmark results (seconds):
   | Engine | Median | Mean | Min | Max | Runs | row_count | max_confidence |
   |---|---:|---:|---:|---:|---:|---:|---:|
   | DuckDB | 0.588484 | 2.243619 | 0.585020 | 5.557354 | 3 | 7471 | 0.999544084072 |
   | SedonaDB | 54.524910 | 55.320437 | 51.254931 | 60.181470 | 3 | 7471 | 0.999544084072 |
   
   This is the script for benchmarking: 
https://github.com/yutannihilation/sedona-db/commit/a2c5a8fd2177586787df6e335de390a03cbeb2a4
   
   I suspect this is because SedonaDB doesn't use the statistics of the `bbox` 
column for pushdown: the query gets faster (though still 10x slower than DuckDB) 
if I replace these conditions on `bbox` with `ST_Intersects(geometry, ...)`. I 
know this DuckDB SQL is temporary because the `bbox` column will disappear, 
but I'm wondering whether this is an expected result.
   
   ```sql
     WHERE
       categories.primary = 'pizza_restaurant'
       AND bbox.xmin BETWEEN -75 AND -73
       AND bbox.ymin BETWEEN 40 AND 41
   ```
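
To illustrate what "using column statistics for pushdown" would mean for the filter above, here is a minimal Python sketch of min/max row-group pruning. It is not SedonaDB's or DuckDB's actual code: `ColumnStats`, `may_match_between`, `prune_row_groups`, and the statistics values are all hypothetical and made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class ColumnStats:
    """Min/max statistics for one column in one row group (hypothetical type)."""
    min_value: float
    max_value: float


def may_match_between(stats: ColumnStats, low: float, high: float) -> bool:
    """Return True if the column's [min, max] range overlaps [low, high],
    i.e. some row could satisfy `col BETWEEN low AND high`."""
    return stats.max_value >= low and stats.min_value <= high


def prune_row_groups(row_groups, xmin_range, ymin_range):
    """Return the indices of row groups whose statistics overlap both ranges."""
    kept = []
    for index, group in enumerate(row_groups):
        if (may_match_between(group["bbox.xmin"], *xmin_range)
                and may_match_between(group["bbox.ymin"], *ymin_range)):
            kept.append(index)
    return kept


# Made-up statistics for three row groups; only the second overlaps the filter.
row_groups = [
    {"bbox.xmin": ColumnStats(-125.0, -110.0), "bbox.ymin": ColumnStats(30.0, 49.0)},
    {"bbox.xmin": ColumnStats(-80.0, -70.0), "bbox.ymin": ColumnStats(38.0, 45.0)},
    {"bbox.xmin": ColumnStats(0.0, 20.0), "bbox.ymin": ColumnStats(40.0, 55.0)},
]

# Same ranges as the WHERE clause: xmin in [-75, -73], ymin in [40, 41].
kept = prune_row_groups(row_groups, xmin_range=(-75, -73), ymin_range=(40, 41))
print(kept)
```

With pruning like this, an engine only has to read and decode the row groups that survive the check. If the `bbox` statistics are ignored, every row group has to be scanned, which would be consistent with the slowdown I'm seeing, though I haven't confirmed that this is the cause.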


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

