https://dbfiddle.uk/qJDW-DjP
On Thu, Nov 9, 2023 at 9:11 AM Cameron McCloud <[email protected]> wrote: > Here's the EXPLAIN for the second query with the join and WHERE on > ST_DWithin: > > Nested Loop (cost=0.00..403093.00 rows=2 width=20) > Join Filter: (b.id = c.building_id) > -> Seq Scan on customer c (cost=0.00..231.00 rows=16000 width=4) > -> Materialize (cost=0.00..402382.01 rows=2 width=20) > -> Seq Scan on building b (cost=0.00..402382.00 rows=2 width=20) > Filter: st_dwithin((st_makepoint((longitude)::double > precision, (latitude)::double precision))::geography, > '0101000020E61000008AE6012CF23158C065A9F57EA39F4040'::geography, > '50000'::double precision, true) > JIT: > Functions: 7 > Options: Inlining false, Optimization false, Expressions true, > Deforming true > > Here's the EXPLAIN ANALYZE: > Nested Loop (cost=0.00..403093.00 rows=2 width=20) (actual > time=29.391..5196.713 rows=6453 loops=1) > Join Filter: (b.id = c.building_id) > Rows Removed by Join Filter: 82424169 > -> Seq Scan on customer c (cost=0.00..231.00 rows=16000 width=4) > (actual time=0.004..2.484 rows=16000 loops=1) > -> Materialize (cost=0.00..402382.01 rows=2 width=20) (actual > time=0.000..0.149 rows=5152 loops=16000) > -> Seq Scan on building b (cost=0.00..402382.00 rows=2 > width=20) (actual time=3.292..28.549 rows=6453 loops=1) > Filter: st_dwithin((st_makepoint((longitude)::double > precision, (latitude)::double precision))::geography, > '0101000020E61000008AE6012CF23158C065A9F57EA39F4040'::geography, > '50000'::double precision, true) > Rows Removed by Filter: 9547 > Planning Time: 0.182 ms > JIT: > Functions: 7 > Options: Inlining false, Optimization false, Expressions true, > Deforming true > Timing: Generation 0.256 ms, Inlining 0.000 ms, Optimization 0.158 ms, > Emission 3.133 ms, Total 3.547 ms > Execution Time: 5197.586 ms > > Here's the EXPLAIN for the 3rd query (JOIN but convert ST_DWithin to > integer) > > Hash Join (cost=431.00..402854.10 rows=80 width=20) > Hash Cond: (b.id = c.building_id) > -> Seq Scan on building b (cost=0.00..402422.00 rows=80 width=20) > Filter: (CASE st_dwithin((st_makepoint((longitude)::double > precision, (latitude)::double precision))::geography, > '0101000020E61000008AE6012CF23158C065A9F57EA39F4040'::geography, > '50000'::double precision, true) WHEN CASE_TEST_EXPR THEN 1 ELSE 0 END = 1) > -> Hash (cost=231.00..231.00 rows=16000 width=4) > -> Seq Scan on customer c (cost=0.00..231.00 rows=16000 width=4) > JIT: > Functions: 12 > Options: Inlining false, Optimization false, Expressions true, > Deforming true > > And the EXPLAIN ANALYZE: > > Hash Join (cost=431.00..402854.10 rows=80 width=20) (actual > time=6.470..33.691 rows=6453 loops=1) > Hash Cond: (b.id = c.building_id) > -> Seq Scan on building b (cost=0.00..402422.00 rows=80 width=20) > (actual time=4.280..30.210 rows=6453 loops=1) > Filter: (CASE st_dwithin((st_makepoint((longitude)::double > precision, (latitude)::double precision))::geography, > '0101000020E61000008AE6012CF23158C065A9F57EA39F4040'::geography, > '50000'::double precision, true) WHEN CASE_TEST_EXPR THEN 1 ELSE 0 END = 1) > Rows Removed by Filter: 9547 > -> Hash (cost=231.00..231.00 rows=16000 width=4) (actual > time=2.173..2.174 rows=16000 loops=1) > Buckets: 16384 Batches: 1 Memory Usage: 691kB > -> Seq Scan on customer c (cost=0.00..231.00 rows=16000 > width=4) (actual time=0.007..0.931 rows=16000 loops=1) > Planning Time: 0.132 ms > JIT: > Functions: 12 > Options: Inlining false, Optimization false, Expressions true, > Deforming true > Timing: Generation 0.311 ms, Inlining 0.000 ms, Optimization 0.184 ms, > Emission 4.085 ms, Total 4.580 ms > Execution Time: 34.239 ms > > > > > On Wed, Nov 8, 2023 at 6:10 PM Regina Obe <[email protected]> wrote: > >> First of all is that really your join clause? You seem to be missing an >> ON >> >> >> >> from building b >> >> join customer c >> >> >> >> >> >> Also please output >> >> >> >> EXPLAIN >> >> >> >> And the >> >> >> >> EXPLAIN ANALYZE of each query >> >> >> >> *From:* postgis-users <[email protected]> *On Behalf >> Of *Cameron McCloud via postgis-users >> *Sent:* Wednesday, November 8, 2023 5:38 AM >> *To:* [email protected] >> *Cc:* Cameron McCloud <[email protected]> >> *Subject:* Re: [postgis-users] ST_DWithin slow if query is joined to >> another table, but fast if ST_DWithin cast to an integer >> >> >> >> Test code is here: >> https://github.com/cameronmccloud/postgis_slow_st_dwithin/blob/main/test.sql >> >> >> >> On Wed, Nov 8, 2023 at 9:25 AM Cameron McCloud <[email protected]> >> wrote: >> >> Hi, >> >> >> >> Postgis Version: 3.3.4 >> >> Postgres Version: 14.9 >> >> >> >> We have 2 tables, "building" with lat/long and "customer" with a FK to >> building. There's a 1:1 relationship between the two. >> >> >> >> The test tables we're using have 16K rows each. Our production data has a >> lot more, but we could reproduce this on a smaller dataset. >> >> >> >> We found some odd behaviour when using ST_DWITHIN in a WHERE clause but >> only when the "building" table is joined to the "customer" table. >> >> >> >> We also found that converting the result of ST_DWITHIN to an integer (1/0 >> for true/false) and using the integer in the WHERE is fast. >> >> >> >> -- This query works as expected and takes 60ms returning 6K rows. >> >> select b.* >> >> from building b >> >> where >> >> st_dwithin(st_makepoint(b.longitude, b.latitude)::geography, >> st_makepoint(-96.7804060, 33.2471770)::geography, 50000); >> >> >> >> -- This query is orders of magnitude slower - 3000ms, even though joining >> the two tables without the WHERE takes 30ms >> >> select b.* >> >> from building b >> >> join customer c >> >> where >> >> st_dwithin(st_makepoint(b.longitude, b.latitude)::geography, >> st_makepoint(-96.7804060, 33.2471770)::geography, 50000); >> >> >> >> -- This query converts the result of ST_DWITHIN to an integer. It's fast >> and takes 80ms >> >> select b.* >> >> from building b >> >> join customer c >> >> where >> >> case st_dwithin(st_makepoint(b.longitude, b.latitude)::geography, >> st_makepoint(-96.7804060, 33.2471770)::geography, 50000) >> >> when true then 1 >> >> else 0 >> >> end = 1; >> >> >> >> We have no idea why this is the case, but we're curious. In our >> production scenario using the "case...when" brought a query down from 6 >> minutes to 6 seconds. We'd love to know why this might be and if there are >> other scenarios like this that could increase the performance of our >> Postgis queries. >> >> >> >> Thanks, >> >> >> >> Cam. >> >> >> >> >> >> >> >> >> >>
_______________________________________________ postgis-users mailing list [email protected] https://lists.osgeo.org/mailman/listinfo/postgis-users
