I looked at our list of items between 3.1.7 and 3.1.8 

 

https://git.osgeo.org/gitea/postgis/postgis/src/branch/stable-3.1/NEWS

 

and can’t find any that would cause what you describe below.

It’s possible whatever was changed we didn’t consider strong enough to put in 
the news.

 

I haven’t checked the git repo yet to see what changed.

 

Usually these switches in plans are often triggered by changes in costs of the 
functions or just data and it is possible we adjusted some of the costs on 
these functions.

 

I was thinking the 10 to 14 switch might have been caused by the function 
instrumentation changes.

But that wouldn’t explain your experiencing the same issue on 14 3.1.7 vs 14 
3.1.8

 

If you still have both running, can you give us the output 

 

SELECT postgis_full_version(), version();

 

So we can confirm we are truly comparing only a difference in PostGIS version 
and not the libraries used or PostgreSQL minor version.

 

Thanks,

Regina

 

 

From: postgis-users [mailto:[email protected]] On Behalf Of 
Anderson Mallmann
Sent: Monday, July 17, 2023 10:23 AM
To: [email protected]; [email protected]; 
[email protected]
Subject: [postgis-users] Problem with Nested Loop in a query with postgis

 

Hi,

Here at the company I work for, we upgraded a postgres database from version 10 
to 14 and postgis 2.5.3 to 3.3.3 and in a specific query we noticed a huge 
performance decrease between these versions. This query was running less than 
1s and now it takes more than 50s.
Now we are trying to figure out what the real problem is and in which version 
this started to happen. Some of our discoveries so far:

1) The query runs fine until postgis 3.1.7
2) In postgis 3.1.8 the query execution changed from Merge Semi Join to Nested 
Loop Semi Join.
3) Running this query on postgis >= 3.1.8 disabling nested loop with “set 
enable_nestloop = false” the query runs as before without any problem.

We don’t know why the plan changed in postgis >= 3.1.8 to use Nested Loop, but 
even in earlier versions if any query like this uses nested loop, the 
performance drops.

Query:
SELECT pols.pol_id, pols.type, kml, json_build_object('type', 'Feature', 
'geometry', st_asgeojson(pols.geom)::jsonb) as geoJson, pols.status pol_status, 
pols.owner_id pol_owner, pols.details, pols.creation_date as pol_creation_date, 
pols.country as country, pols.state as state, pols.city as city, pols.label as 
label FROM polygons pols WHERE pols.owner_id is null AND pols.status = 'ACTIVE' 
AND pols.type in ('REGION') AND pols.details->>'state' in (SELECT s.state FROM 
states s WHERE ST_DWITHIN(Geography(ST_Transform(s.geom,4326)), 
ST_GeographyFromText('POINT(-43.113826 -22.9022)'),'15000')) AND 
(ST_DWithin(ST_GeographyFromText('POINT(-43.113826 -22.9022)'), 
Geography(pols.geom),'15000')) AND 
(ST_MaxDistance(ST_GeomFromText('POINT(-43.113826 -22.9022)', 4326), pols.geom) 
* 111195) <= '15000'

Here are the explains for this query:

pg 14 - postgis 3.1.7
 Merge Semi Join  (cost=15.83..16.04 rows=1 width=263) (actual 
time=529.041..586.241 rows=416 loops=1)
   Output: pols.pol_id, pols.type, pols.kml, json_build_object('type', 
'Feature', 'geometry', (st_asgeojson(pols.geom, 9, 8))::jsonb), pols.status, 
pols.owner_id, pols.details, pols.creation_date, pols.country, pols.state, 
pols.city, pols.label
   Merge Cond: (((pols.details ->> 'state'::text)) = (s.state)::text)
   Buffers: shared hit=3193
   ->  Sort  (cost=4.49..4.50 rows=1 width=4656) (actual time=244.333..244.367 
rows=416 loops=1)
         Output: pols.pol_id, pols.type, pols.kml, pols.geom, pols.status, 
pols.owner_id, pols.details, pols.creation_date, pols.country, pols.state, 
pols.city, pols.label, ((pols.details ->> 'state'::text))
         Sort Key: ((pols.details ->> 'state'::text))
         Sort Method: quicksort  Memory: 1314kB
         Buffers: shared hit=2131
         ->  Index Scan using polygons_geography_state_idx on region.polygons 
pols  (cost=0.28..4.48 rows=1 width=4656) (actual time=10.207..243.079 rows=416 
loops=1)
               Output: pols.pol_id, pols.type, pols.kml, pols.geom, 
pols.status, pols.owner_id, pols.details, pols.creation_date, pols.country, 
pols.state, pols.city, pols.label, (pols.details ->> 'state'::text)
               Index Cond: (geography(pols.geom) && 
'0101000020E61000009EB7B1D9918E45C082734694F6E636C0'::geography)
               Filter: 
(('0101000020E61000009EB7B1D9918E45C082734694F6E636C0'::geography && 
_st_expand(geography(pols.geom), '15000'::double precision)) AND 
_st_dwithin('0101000020E61000009EB7B1D9918E45C082734694F6E636C0'::geography, 
geography(pols.geom), '15000'::double precision, true) AND 
((_st_maxdistance('0101000020E61000009EB7B1D9918E45C082734694F6E636C0'::geometry,
 st_convexhull(pols.geom)) * '111195'::double precision) <= '15000'::double 
precision))
               Rows Removed by Filter: 395
               Buffers: shared hit=2125
   ->  Sort  (cost=11.34..11.36 rows=9 width=3) (actual time=284.570..284.570 
rows=1 loops=1)
         Output: s.state
         Sort Key: s.state
         Sort Method: quicksort  Memory: 25kB
         Buffers: shared hit=1062
         ->  Seq Scan on region.states s  (cost=0.00..11.19 rows=9 width=3) 
(actual time=180.759..284.564 rows=1 loops=1)
               Output: s.state
               Filter: st_dwithin(geography(st_transform(s.geom, 4326)), 
'0101000020E61000009EB7B1D9918E45C082734694F6E636C0'::geography, 
'15000'::double precision, true)
               Rows Removed by Filter: 26
               Buffers: shared hit=1062
 Query Identifier: -6178717669455780324
 Planning:
   Buffers: shared hit=481
 Planning Time: 31.139 ms
 Execution Time: 586.869 ms

-----------

pg 14 - postgis 3.1.8
 Nested Loop Semi Join  (cost=0.67..111.50 rows=1 width=263) (actual 
time=179.622..59918.477 rows=416 loops=1)
   Output: pols.pol_id, pols.type, pols.kml, json_build_object('type', 
'Feature', 'geometry', (st_asgeojson(pols.geom, 9, 8))::jsonb), pols.status, 
pols.owner_id, pols.details, pols.creation_date, pols.country, pols.state, 
pols.city, pols.label
   Join Filter: ((pols.details ->> 'state'::text) = (s.state)::text)
   Buffers: shared hit=72116 read=1075
   ->  Index Scan using polygons_geography_state_idx on region.polygons pols  
(cost=0.41..55.77 rows=1 width=4656) (actual time=9.477..221.758 rows=416 
loops=1)
         Output: pols.pol_id, pols.type, pols.kml, pols.geom, pols.status, 
pols.owner_id, pols.details, pols.creation_date, pols.country, pols.state, 
pols.city, pols.label, pols.last_updated
         Index Cond: (geography(pols.geom) && 
_st_expand('0101000020E61000009EB7B1D9918E45C082734694F6E636C0'::geography, 
'15000'::double precision))
         Filter: 
(st_dwithin('0101000020E61000009EB7B1D9918E45C082734694F6E636C0'::geography, 
geography(pols.geom), '15000'::double precision, true) AND 
((_st_maxdistance('0101000020E61000009EB7B1D9918E45C082734694F6E636C0'::geometry,
 st_convexhull(pols.geom)) * '111195'::double precision) <= '15000'::double 
precision))
         Rows Removed by Filter: 395
         Buffers: shared hit=269 read=692
   ->  Index Scan using states_geography_idx on region.states s  
(cost=0.26..54.37 rows=1 width=3) (actual time=142.676..142.676 rows=1 
loops=416)
         Output: s.state_code, s.state, s.country, s.geom
         Index Cond: (geography(st_transform(s.geom, 4326)) && 
_st_expand('0101000020E61000009EB7B1D9918E45C082734694F6E636C0'::geography, 
'15000'::double precision))
         Filter: st_dwithin(geography(st_transform(s.geom, 4326)), 
'0101000020E61000009EB7B1D9918E45C082734694F6E636C0'::geography, 
'15000'::double precision, true)
         Rows Removed by Filter: 1
         Buffers: shared hit=70988 read=171
 Query Identifier: -6178717669455780324
 Planning:
   Buffers: shared hit=401 read=45
 Planning Time: 30.729 ms
 Execution Time: 59919.343 ms

Regards,
Anderson M.

_______________________________________________
postgis-users mailing list
[email protected]
https://lists.osgeo.org/mailman/listinfo/postgis-users

Reply via email to