Re: PostGIS 3.5.2 ST_Contains() query on gist is slow

Daryl Herzmann Tue, 09 Sep 2025 07:14:39 -0700

Greetings,

I suspect that my comment is not all that helpful, but this general
case of postgres preferring the much slower ST_Contains vs using the
first two query conditions and then applying ST_Contains is something
I have greatly struggled with over recent releases.  My hacky solution
is to use a CTE with OFFSET 0 to force the order of query conditions
applied.


```
with mycte as (
  SELECT wn.way_id, n.geom FROM way_nodes AS wn, nodes AS n WHERE
wn.node_id = 8646506472 AND wn.node_id = n.id OFFSET 0)

select way_id from mycte where
ST_Contains(ST_GeomFromEWKT('SRID=4326;POLYGON((6.1198199
52.612195500000006,6.1198199 53.2038323,7.0927397 53.2038323,7.0927397
52.612195500000006,6.1198199 52.612195500000006))'), geom);
```

It is critical to use OFFSET 0, or the query planner will collapse
this entire query into the previous poor performing one.  So I guess I
am at least curious what happens when you run this CTE example.

daryl

On Tue, Sep 9, 2025 at 1:31 AM Sebastiaan Couwenberg via postgis-users
<postgis-users@lists.osgeo.org> wrote:
>
> On 9/9/25 7:31 AM, Regina Obe wrote:
> >> On 9/9/25 6:27 AM, Sebastiaan Couwenberg via postgis-users wrote:
> >> I wasn't aware of the @ operator, looking at the documentation shouldn't 
> >> the
> >> equivalent of ST_Contains() be ~:
> >
> > @ and ~ are commutators of each other, but neither is equivalent to 
> > ST_Contains as they are both bounding box operators.
> > I think we always use @ for the short-hand of index check and I was trying 
> > to match that
> > I think I had my logic backwards, so I would expect the below to be slow as 
> > well if your ~ is slow.
> >
> > SELECT wn.way_id FROM
> > way_nodes
> > AS wn, nodes AS n WHERE wn.node_id = 8646506472 AND wn.node_id =
> >   n.id
> > AND  n.geom @ ST_GeomFromEWKT('SRID=4326;POLYGON((6.1198199
> > 52.612195500000006,6.1198199 53.2038323,7.0927397
> > 53.2038323,7.0927397 52.612195500000006,6.1198199
> > 52.612195500000006))');
>
> That is also slow indeed:
>
>                                                                               
>                                     QUERY PLAN
> ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>   Nested Loop  (cost=0.99..17.04 rows=1 width=8) (actual 
> time=4845.847..4845.849 rows=0 loops=1)
>     Buffers: shared hit=6876283 read=713674
>     ->  Index Scan using idx_way_nodes_node_id on way_nodes wn  
> (cost=0.57..8.59 rows=1 width=16) (actual time=0.046..0.048 rows=1 loops=1)
>           Index Cond: (node_id = '8646506472'::bigint)
>           Buffers: shared hit=1 read=4
>     ->  Index Scan using idx_nodes_geom on nodes n  (cost=0.42..8.44 rows=1 
> width=8) (actual time=4845.797..4845.797 rows=0 loops=1)
>           Index Cond: (geom @ 
> '0103000020E61000000100000005000000D5A35F11B27A184013A1116C5C4E4A40D5A35F11B27A18402A2F432D179A4A40D529EA27F75E1C402A2F432D179A4A40D529EA27F75E1C4013A1116C5C4E4A40D5A35F11B27A184013A1116C5C4E4A40'::geometry)
>           Filter: (id = '8646506472'::bigint)
>           Rows Removed by Filter: 9886042
>           Buffers: shared hit=6876282 read=713670
>   Planning:
>     Buffers: shared hit=146 read=15
>   Planning Time: 13.788 ms
>   Execution Time: 4845.945 ms
> (14 rows)
>
> > Have you done a
> >
> > vacuum analyze way_nodes;
> > vacuum analyze nodes;
>
> Yes, Osmosis does this.
>
> Running those queries manually again does not improve the performance.
>
> > Often in these kind of cases when one side is always filtering by an id I 
> > use a btree_gist index.
> >
> > What happens if you do:
> >
> > CREATE EXTENSION btree_gist;
> >
> > CREATE INDEX ix_id_nodes_geom ON nodes USING gist(id, geom);
> >
> > As I am guessing you are always matching by node_id and you'd want to force 
> > that match first before filtering by geometry containment which would 
> > always be slower.
>
> The query in question is executed on all the nodes that are part of boundary 
> ways to find (non-boundary) ways that are connected to those boundaries.
>
> Most of these queries will get 0 results because most of those nodes are not 
> connected to other ways.
>
> The btree_gist index does not significantly speedup the query, because it 
> doesn't get used:
>
> osm-nl=# EXPLAIN (ANALYZE, BUFFERS) SELECT wn.way_id FROM way_nodes AS wn, 
> nodes AS n WHERE wn.node_id = 8646506472 AND wn.node_id = n.id AND 
> ST_Contains(ST_GeomFromEWKT('SRID=4326;POLYGON((6.1198199 
> 52.612195500000006,6.1198199 53.2038323,7.0927397 53.2038323,7.0927397 
> 52.612195500000006,6.1198199 52.612195500000006))'), n.geom);
>                                                                               
>                                                         QUERY PLAN
> -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
>   Nested Loop  (cost=0.99..29.54 rows=1 width=8) (actual 
> time=5046.293..5046.294 rows=0 loops=1)
>     Buffers: shared hit=6876048 read=713885 written=5708
>     ->  Index Scan using idx_way_nodes_node_id on way_nodes wn  
> (cost=0.57..8.59 rows=1 width=16) (actual time=0.220..0.223 rows=1 loops=1)
>           Index Cond: (node_id = '8646506472'::bigint)
>           Buffers: shared read=5
>     ->  Index Scan using idx_nodes_geom on nodes n  (cost=0.42..20.94 rows=1 
> width=8) (actual time=5046.068..5046.068 rows=0 loops=1)
>           Index Cond: (geom @ 
> '0103000020E61000000100000005000000D5A35F11B27A184013A1116C5C4E4A40D5A35F11B27A18402A2F432D179A4A40D529EA27F75E1C402A2F432D179A4A40D529EA27F75E1C4013A1116C5C4E4A40D5A35F11B27A184013A1116C5C4E4A40'::geometry)
>           Filter: ((id = '8646506472'::bigint) AND 
> st_contains('0103000020E61000000100000005000000D5A35F11B27A184013A1116C5C4E4A40D5A35F11B27A18402A2F432D179A4A40D529EA27F75E1C402A2F432D179A4A40D529EA27F75E1C4013A1116C5C4E4A40D5A35F11B27A184013A1116C5C4E4A40'::geometry,
>  geom))
>           Rows Removed by Filter: 9886042
>           Buffers: shared hit=6876048 read=713880 written=5708
>   Planning:
>     Buffers: shared hit=54 read=3 dirtied=1
>   Planning Time: 0.418 ms
>   Execution Time: 5046.317 ms
> (14 rows)
>
> Kind Regards,
>
> Bas
>
> --
>   GPG Key ID: 4096R/6750F10AE88D4AF1
> Fingerprint: 8182 DE41 7056 408D 6146  50D1 6750 F10A E88D 4AF1

Re: PostGIS 3.5.2 ST_Contains() query on gist is slow

Reply via email to