The second note on the documentation page you referenced: "Index only kicks in if one of the 
geometries is a constant (not in a subquery/cte). e.g. 'SRID=3005;POINT(1011102 450541)'::geometry 
instead of a.geom".  Neither value passed to "<->" in your query is a constant.

On 6/22/22 09:22, Lars Aksel Opsahl wrote:
Hi

I have two simple tables.

Table"pg_temp_30.g1temp"
Column|Type| Collation | Nullable|Default
--------+------------------------+-----------+----------+------------------------------------
geo| geometry(Polygon,4258) | ||
id | integer| | notnull| nextval('g1temp_id_seq'::regclass)
Indexes:
"g1temp_pkey"PRIMARYKEY, btree(id)
"g1temp_geo_idx"gist(geo)


Table"pg_temp_30.g2temp"
Column|Type| Collation | Nullable|Default
--------+------------------------+-----------+----------+------------------------------------
geo| geometry(Polygon,4258) | ||
id | integer| | notnull| nextval('g2temp_id_seq'::regclass)
Indexes:
"g2temp_pkey"PRIMARYKEY, btree(id)
"g2temp_geo_idx"gist(geo)


And I need to find the polygon in g2temp that has the longest distance to any 
polygon in d1temp.
g1temp has 542 rows and table g2temp has 63483 rows. All polygons are simple 
rectangles.

I run analyze
analyzeg1temp;
analyzeg2temp;


And then run this query

EXPLAINANALYZE
WITHindex_query AS(
SELECTg2.geo, ST_Distance(g1.geo, g2.geo,true) ASd, 1 asrunnning_jos
FROM
g1temp ASg1,
g2temp ASg2
ORDERBYg1.geo<-> g2.geodesc
LIMIT1
)
SELECT*
FROMindex_query;

And I get this

QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------------------------
Subquery Scan on index_query(cost=43613127.36..43613153.64 rows=1 width=132) 
(actual time=69029.773..69029.775 rows=1 loops=1)
->Limit(cost=43613127.36..43613153.63 rows=1 width=140) (actual 
time=69029.771..69029.772 rows=1 loops=1)
->Result(cost=43613127.36..947419646.12 rows=34407786 width=140) (actual 
time=69029.769..69029.770 rows=1 loops=1)
->Sort(cost=43613127.36..43699146.83 rows=34407786 width=292) (actual 
time=69019.575..69019.576 rows=1 loops=1)
Sort Key: ((g1.geo <-> g2.geo)) DESC
Sort Method: top-N heapsortMemory: 29kB
->Nested Loop(cost=0.00..43441088.43 rows=34407786 width=292) (actual 
time=0.184..58446.116 rows=34407786 loops=1)
->Seq Scan on g2temp g2(cost=0.00..1245.33 rows=63483 width=120) (actual 
time=0.020..31.744 rows=63483 loops=1)
->Materialize(cost=0.00..14.63 rows=542 width=160) (actual time=0.000..0.030 
rows=542 loops=63483)
->Seq Scan on g1temp g1(cost=0.00..11.92 rows=542 width=160) (actual 
time=0.014..0.202 rows=542 loops=1)
Planning Time: 0.320 ms
Execution Time: 69029.872 ms
(12 rows)​

I do understand that this may take time, but I wonder why I see this "Seq scan" 
both on g1 and g2 ?

From what I read on https://postgis.net/docs/geometry_distance_knn.html it 
should use indexes  in some way.

Have also tested with casting geography when creating table g1 and g2 and that 
did not make any difference.

Tested also with big workmem (1000 MB ) and the result was the same.

I am running on POSTGIS="3.3.0dev 3.1.0alpha2-1532-gc8eedf3ae" [EXTENSION] PGSQL="120" GEOS="3.9.0-CAPI-1.16.2" 
PROJ="7.2.1" LIBXML="2.9.10" LIBJSON="0.13.1" LIBPROTOBUF="1.3.3" WAGYU="0.5.0 (Internal)" TOPOLOGY
(1 row)

Thanks.

Lars

_______________________________________________
postgis-users mailing list
[email protected]
https://lists.osgeo.org/mailman/listinfo/postgis-users
_______________________________________________
postgis-users mailing list
[email protected]
https://lists.osgeo.org/mailman/listinfo/postgis-users

Reply via email to