Hi Tim,

yes you're running into a well known s/limitation/feature/ of spatial SQLite (I'm pretty sure that the Spatialite website has one page about that, but can't find it right now). The use of the ST_ functions doesn't trigger the RTree automatically, and you need to query it explicitly by joining with the rtree_{table_name}_{geom_column_name} virtual table.

For example the following completes quite quickly:

$ ogrinfo -ro -al -q /vsicurl/https://storage.googleapis.com/open-geodata/linz-examples/nz-building-outlines.gpkg -sql "SELECT * FROM nz_building_outlines p JOIN rtree_nz_building_outlines_geom rt ON p.rowid = rt.id WHERE rt.minx < 1771150 and rt.maxx > 1771140 and rt.miny < 5894877 and rt.maxy > 5894870 and ST_Intersects(p.geom, ST_GeomFromText('POINT(1771141 5894876)')) = 1"

Layer name: SELECT
OGRFeature(SELECT):11471
  building_id (Integer) = 4199821
  name (String) =
  use (String) = Unknown
  suburb_locality (String) = Karaka
  town_city (String) =
  territorial_authority (String) = Auckland
  capture_method (String) = Feature Extraction
  capture_source_group (String) = NZ Aerial Imagery
  capture_source_id (Integer) = 1014
  capture_source_name (String) = Auckland 0.075m Urban Aerial Photos (2017)
  capture_source_from (Date) = 2017/03/15
  capture_source_to (Date) = 2017/05/06
  last_modified (Date) = 2019/04/05
  id (Integer64) = 11471
  minx (Real) = 1771139.875
  maxx (Real) = 1771152
  miny (Real) = 5894865.5
  maxy (Real) = 5894883
  MULTIPOLYGON (((1771139.91348946 5894880.21926274,1771146.68318411 5894882.80032223,1771151.97657734 5894868.91662014,1771145.02123927 5894866.26478195,1771143.49962591 5894870.255725,1771142.39909482 5894869.83612922,1771140.01981817 5894876.07658206,1771141.30599197 5894876.56695731,1771139.91348946 5894880.21926274)))


Even

Le 25/09/2025 à 01:38, Tim Harris via gdal-dev a écrit :
Hi, trying to do some queries of fairly large GPKG files sitting in S3 in the hopes that spatial indexes and S3 range requests make it fast and efficient. But it seems that's not the case and I'm wondering if I'm doing something wrong.

The examples below are just looking for a geometry at a point. The filename is fake but the geoms.gpkg contains a layer "geoms" with a bunch of polygons. It's about 1.7 GB in size.

Something like this runs relatively fast, it makes about 17 HTTP requests and outputs the feature:

---
ogrinfo -ro -json -features /vsis3/bucket/geoms.gpkg geoms -spat -105 40 -105 40
---

But this, using -sql instead of -spat, sits forever without finishing. With CPL_CURL_VERBOSE it sort of looks like it's just scanning through the file a range request at a time:

---
ogrinfo -ro -json -features /vsis3/bucket/geoms.gpkg geoms -sql "select * from geoms where ST_Intersects(geom, ST_GeomFromText('POINT(-105 40)', 4326)) = 1"
---

I also tried replacing the ST_GeomFromText with other things like MakePoint(-105, 40, 4326) and ST_EnvIntersects(geom, -105, 40, -105, 40), just in case. It seems like -spat uses the spatial index efficiently but -sql doesn't. I also tried using them both, but having the -sql there seems to make it slow.

I also tried to add "explain query plan" to the sql statement:

---
ogrinfo -ro /vsis3/prod-tilerepo-va2/metadata/20250920T041443/geoms.gpkg geoms -sql "EXPLAIN QUERY PLAN select * from geoms where ST_Intersects(geom, ST_GeomFromText('POINT(-105 40)', 4326)) = 1"
---

That only sort of confirms the scan, and reports "SCAN geoms" in the output. This page, https://sqlite.org/eqp.html, explains the "explain query plan" output and I was hoping to see a "SEARCH geoms USING INDEX ___" of some sort.

Is there something about the use of -sql that precludes it from using the spatial index? Maybe we need to do an ogr2ogr with just -spat to narrow it down, then apply a -sql to that result if we need additional filtering?

Thanks

_______________________________________________
gdal-dev mailing list
[email protected]
https://lists.osgeo.org/mailman/listinfo/gdal-dev

--
http://www.spatialys.com
My software is free, but my time generally not.

_______________________________________________
gdal-dev mailing list
[email protected]
https://lists.osgeo.org/mailman/listinfo/gdal-dev

Reply via email to