Hi Paul, Thanks for the response. I believe this all is a goose chase, I am sorry. I copied my data to a test postgis 3.4.3 instance and noticed little change in speed... My "reproducer" is a part of a larger query, that previous versions of PostgreSQL (v16) were coming up with a better query plan for and thus sending orders of magnitude fewer geometries at ST_Contains. I mistakenly thought postgis was being slow... It frustratingly looks like so:
postgis=# explain (analyze, buffers) select count(*) from spc_outlooks where issue > '2024-05-01' and issue < '2024-05-04' ; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------------------------------ Aggregate (cost=230.66..230.67 rows=1 width=8) (actual time=0.368..0.369 rows=1 loops=1) Buffers: shared hit=339 read=3 -> Nested Loop Left Join (cost=0.84..230.21 rows=177 width=0) (actual time=0.070..0.345 rows=388 loops=1) Buffers: shared hit=339 read=3 -> Index Scan using spc_outlook_issue_idx on spc_outlook o (cost=0.42..84.75 rows=91 width=4) (actual time=0.028..0.076 rows=105 loops=1) Index Cond: ((issue > '2024-05-01 00:00:00-05'::timestamp with time zone) AND (issue < '2024-05-04 00:00:00-05'::timestamp with time zone)) Buffers: shared hit=25 -> Index Only Scan using spc_outlook_geometries_idx on spc_outlook_geometries g (cost=0.42..1.52 rows=8 width=4) (actual time=0.002..0.002 rows=3 loops=105) Index Cond: (spc_outlook_id = o.id) Heap Fetches: 2 Buffers: shared hit=314 read=3 Planning: Buffers: shared hit=17 Planning Time: 0.261 ms Execution Time: 0.392 ms postgis=# explain (analyze, buffers) select count(*) from spc_outlooks where issue > '2024-05-01' and issue < '2024-05-04' and st_contains(geom, ST_Point(-95, 42, 4326)); QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Aggregate (cost=17.29..17.30 rows=1 width=8) (actual time=11732.922..11732.924 rows=1 loops=1) Buffers: shared hit=791561 -> Nested Loop (cost=0.70..17.29 rows=1 width=0) (actual time=2132.852..11732.875 rows=60 loops=1) Buffers: shared hit=791561 -> Index Scan using spc_outlook_geometries_gix on spc_outlook_geometries g (cost=0.28..14.80 rows=1 width=4) (actual time=0.191..11541.355 rows=49009 loops=1) Index Cond: (geom ~ '0101000020E61000000000000000C057C00000000000004540'::geometry) Filter: st_contains(geom, '0101000020E61000000000000000C057C00000000000004540'::geometry) Rows Removed by Filter: 116316 Buffers: shared hit=595525 -> Index Scan using spc_outlook_id_key on spc_outlook o (cost=0.42..2.44 rows=1 width=4) (actual time=0.004..0.004 rows=0 loops=49009) Index Cond: (id = g.spc_outlook_id) Filter: ((issue > '2024-05-01 00:00:00-05'::timestamp with time zone) AND (issue < '2024-05-04 00:00:00-05'::timestamp with time zone)) Rows Removed by Filter: 1 Buffers: shared hit=196036 Planning: Buffers: shared hit=17 Planning Time: 0.414 ms Execution Time: 11732.950 ms Sorry again :( daryl On Thu, Dec 26, 2024 at 1:51 PM Paul Ramsey <pram...@cleverelephant.ca> wrote: > > What was previous “fast” version? What were reference performance numbers on > *that* version? > Can you provide the affected data? > That last “small” query looks pretty fast (half a milisecond). > P > > > On Dec 26, 2024, at 9:39 AM, Daryl Herzmann <akrh...@gmail.com> wrote: > > > > Howdy All, > > > > I saw some list traffic about a performance regression introduced with > > postgis 3.5.0 using ST_Within and I figured what I was observing was > > perhaps the same issue. I am now using postgis 3.5.1 and am still > > seeing my performance problem, so I figured I had better chime in here > > to see what I could be doing wrong :) > > > > My environment is Rocky Linux 9 64bit with pgrpms: > > > > POSTGIS="3.5.1 48ab069" [EXTENSION] PGSQL="170" > > GEOS="3.13.0-CAPI-1.19.0" PROJ="9.4.1 NETWORK_ENABLED=OFF > > URL_ENDPOINT=https://cdn.proj.org > > USER_WRITABLE_DIRECTORY=/var/lib/pgsql/.local/s > > hare/proj DATABASE_PATH=/usr/proj94/share/proj/proj.db" (compiled > > against PROJ 9.5.1) LIBXML="2.9.13" LIBJSON="0.14" LIBPROTOBUF="1.3.3" > > WAGYU="0.5.0 (Internal)" > > > > I notice that proj version mismatch, so I am unsure if that is at play or > > not... > > > > A common explain analyze looks like so: > > > > postgis=# explain (analyze, buffers) select count(*) from > > spc_outlook_geometries where st_contains(geom, ST_Point(-95, 42, > > 4326)); > > > > QUERY PLAN > > ------------------------------------------------------------------------------------------------------------------------------------------------------------------ > > Aggregate (cost=14.80..14.81 rows=1 width=8) (actual > > time=11745.230..11745.231 rows=1 loops=1) > > Buffers: shared hit=572500 read=23000 > > -> Index Scan using spc_outlook_geometries_gix on > > spc_outlook_geometries (cost=0.28..14.80 rows=1 width=0) (actual > > time=0.122..11738.404 rows=49009 loops=1) > > Index Cond: (geom ~ > > '0101000020E61000000000000000C057C00000000000004540'::geometry) > > Filter: st_contains(geom, > > '0101000020E61000000000000000C057C00000000000004540'::geometry) > > Rows Removed by Filter: 116312 > > Buffers: shared hit=572500 read=23000 > > Planning Time: 0.172 ms > > Execution Time: 11745.475 ms > > (9 rows) > > > > The relation looks like so. > > > > postgis=# \d spc_outlook_geometries > > Table "public.spc_outlook_geometries" > > Column | Type | Collation | Nullable | > > Default > > ----------------+-----------------------------+-----------+----------+--------- > > spc_outlook_id | integer | | | > > threshold | character varying(4) | | | > > category | character varying(64) | | | > > geom | geometry(MultiPolygon,4326) | | | > > geom_layers | geometry(MultiPolygon,4326) | | | > > Indexes: > > "spc_outlook_geometries_combo_idx" btree (threshold, category) > > "spc_outlook_geometries_gix" gist (geom) > > "spc_outlook_geometries_idx" btree (spc_outlook_id) > > "spc_outlook_geometries_layers_gix" gist (geom_layers) > > Check constraints: > > "_sog_geom_isvalid" CHECK (st_isvalid(geom)) > > "_sog_geom_layers_isvalid" CHECK (st_isvalid(geom_layers)) > > Foreign-key constraints: > > "spc_outlook_geometries_threshold_fkey" FOREIGN KEY (threshold) > > REFERENCES spc_outlook_thresholds(threshold) > > > > Even a "cheaper" query takes considerable time. > > > > postgis=# explain (analyze, buffers) select st_area(geom) from > > spc_outlook_geometries where st_contains(geom, ST_Point(-95, 42, > > 4326)) LIMIT 10; > > > > QUERY PLAN > > ----------------------------------------------------------------------------------------------------------------------------------------------------------- > > Limit (cost=0.28..14.93 rows=1 width=8) (actual time=0.115..0.643 > > rows=10 loops=1) > > Buffers: shared hit=90 > > -> Index Scan using spc_outlook_geometries_gix on > > spc_outlook_geometries (cost=0.28..14.93 rows=1 width=8) (actual > > time=0.115..0.641 rows=10 loops=1) > > Index Cond: (geom ~ > > '0101000020E61000000000000000C057C00000000000004540'::geometry) > > Filter: st_contains(geom, > > '0101000020E61000000000000000C057C00000000000004540'::geometry) > > Rows Removed by Filter: 57 > > Buffers: shared hit=90 > > Planning Time: 0.086 ms > > Execution Time: 0.657 ms > > > > thank you for your time! > > daryl >