>From: postgis-users <[email protected]> on behalf of Jim >Klassen <[email protected]> >Sent: Wednesday, June 22, 2022 5:27 PM >To: [email protected] <[email protected]> >Subject: Re: [postgis-users] >https://postgis.net/docs/geometry_distance_knn.html and index usage > >The second note on the documentation page you referenced: "Index only kicks in >if one of the geometries is a constant (not in a subquery/cte). e.g. >'SRID=3005;POINT(1011102 450541)'::geometry instead of a.geom". Neither value >passed to "<->" in your query is a constant. > >On 6/22/22 09:22, Lars Aksel Opsahl wrote:
>From: postgis-users <[email protected]> on behalf of Jim >Klassen <[email protected]> >Sent: Wednesday, June 22, 2022 5:27 PM >To: [email protected] <[email protected]> >Subject: Re: [postgis-users] >https://postgis.net/docs/geometry_distance_knn.html and index usage > >The second note on the documentation page you referenced: "Index only kicks in >if one of the geometries is a constant (not in a subquery/cte). e.g. >'SRID=3005;POINT(1011102 450541)'::geometry instead of a.geom". Neither value >passed to "<->" in your query is a constant. > Hi Thanks but this does not seems to be correct either, look at this sample EXPLAIN ANALYZE WITH index_query AS ( SELECT g2.geo, st_distance(g2.geo, 'SRID=4258;POLYGON((4.952337313368302 58.39771808688609,4.952337313368302 58.6106223597104,5.384743019998486 58.6106223597104,5.384743019998486 58.39771808688609,4.952337313368302 58.39771808688609))'::geometry) as d FROM g2temp AS g2 ORDER BY 'SRID=4258;POLYGON((4.952337313368302 58.39771808688609,4.952337313368302 58.6106223597104,5.384743019998486 58.6106223597104,5.384743019998486 58.39771808688609,4.952337313368302 58.39771808688609))'::geometry <-> g2.geo desc LIMIT 1 ) SELECT * FROM index_query; We still have a sequence scan Subquery Scan on index_query (cost=80916.49..80942.77 rows=1 width=40) (actual time=106.185..106.187 rows=1 loops=1) -> Limit (cost=80916.49..80942.76 rows=1 width=48) (actual time=106.184..106.185 rows=1 loops=1) -> Result (cost=80916.49..1748138.78 rows=63483 width=48) (actual time=106.181..106.182 rows=1 loops=1) -> Sort (cost=80916.49..81075.20 rows=63483 width=40) (actual time=106.178..106.179 rows=1 loops=1) Sort Key: (('0103000020A210000001000000050000009EB53E8331CF13400A1A206DE8324D409EB53E8331CF1340CD949CDF284E4D40F5070113FA891540CD949CDF284E4D40F5070113FA8915400A1A206DE8324D409EB53E8331CF13400A1A206DE8324D40'::geometry <-> g2.geo)) DESC Sort Method: top-N heapsort Memory: 25kB -> Seq Scan on g2temp g2 (cost=0.00..80599.08 rows=63483 width=40) (actual time=0.061..90.173 rows=63483 loops=1) Planning Time: 0.225 ms Execution Time: 106.231 ms (9 rows) Then we do test with cross join lateral, but we do ORDER BY dist desc EXPLAIN ANALYZE SELECT g1.id AS g1_id, g2.id AS g1_id, g2.geo::geometry(Polygon,4258) AS street_geo, g2.dist FROM g1temp g1 CROSS JOIN LATERAL ( SELECT g2.geo, g2.id, g2.geo <-> g1.geo AS dist FROM g2temp AS g2 ORDER BY dist desc LIMIT 1 ) g2 LIMIT 1; And we still have a sequence scan QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=80916.49..161683.74 rows=1 width=48) (actual time=6576.006..6576.008 rows=1 loops=1) -> Nested Loop (cost=80916.49..43856764.40 rows=542 width=48) (actual time=6576.004..6576.005 rows=1 loops=1) -> Seq Scan on g1temp g1 (cost=0.00..11.92 rows=542 width=36) (actual time=0.017..0.018 rows=1 loops=1) -> Limit (cost=80916.49..80916.50 rows=1 width=44) (actual time=6575.981..6575.981 rows=1 loops=1) -> Sort (cost=80916.49..81075.20 rows=63483 width=44) (actual time=6575.979..6575.979 rows=1 loops=1) Sort Key: ((g2.geo <-> g1.geo)) DESC Sort Method: top-N heapsort Memory: 25kB -> Seq Scan on g2temp g2 (cost=0.00..80599.08 rows=63483 width=44) (actual time=0.380..6556.206 rows=63483 loops=1) Planning Time: 0.255 ms Execution Time: 6576.056 ms (10 rows) But with we get an index scan if we do order by "dist asc" and not "dist desc" as above EXPLAIN ANALYZE SELECT g1.id AS g1_id, g2.id AS g1_id, g2.geo::geometry(Polygon,4258) AS street_geo, g2.dist FROM g1temp g1 CROSS JOIN LATERAL ( SELECT g2.geo, g2.id, g2.geo <-> g1.geo AS dist FROM g2temp AS g2 ORDER BY dist asc LIMIT 1 ) g2 LIMIT 1; QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=0.28..1.89 rows=1 width=48) (actual time=39.242..39.244 rows=1 loops=1) -> Nested Loop (cost=0.28..875.21 rows=542 width=48) (actual time=39.240..39.241 rows=1 loops=1) -> Seq Scan on g1temp g1 (cost=0.00..11.92 rows=542 width=36) (actual time=0.017..0.017 rows=1 loops=1) -> Limit (cost=0.28..1.57 rows=1 width=44) (actual time=39.219..39.219 rows=1 loops=1) -> Index Scan using g2temp_geo_idx on g2temp g2 (cost=0.28..82069.98 rows=63483 width=44) (actual time=39.185..39.185 rows=1 loops=1) Order By: (geo <-> g1.geo) Planning Time: 0.310 ms Execution Time: 39.300 ms (8 rows) I have not checked the results, but it seems like indexes only can used when locking for the nearest and not the one farthest away as I am locking for. Thanks. Lars
_______________________________________________ postgis-users mailing list [email protected] https://lists.osgeo.org/mailman/listinfo/postgis-users
