I can confirm that this two-month old email report still produces different results with indexes on/off in git master, which I don't think is ever correct behavior.
--------------------------------------------------------------------------- On Wed, Jun 24, 2020 at 03:11:03PM -0700, Jesse Zhang wrote: > Hi hackers, > > While working with Chris Hajas on merging Postgres 12 with Greenplum > Database we stumbled upon the following strange behavior in the geometry > type polygon: > > ------ >8 -------- > > CREATE TEMP TABLE foo (p point); > CREATE INDEX ON foo USING gist(p); > > INSERT INTO foo VALUES ('0,0'), ('1,1'), ('NaN,NaN'); > > SELECT $q$ > SELECT * FROM foo WHERE p <@ polygon '(0,0), (0, 100), (100, 100), (100, 0)' > $q$ AS qry \gset > > BEGIN; > SAVEPOINT yolo; > SET LOCAL enable_seqscan TO off; > :qry; > > ROLLBACK TO SAVEPOINT yolo; > SET LOCAL enable_indexscan TO off; > SET LOCAL enable_bitmapscan TO off; > :qry; > > ------ 8< -------- > > If you run the above repro SQL in HEAD (and 12, and likely all older > versions), you get the following output: > > CREATE TABLE > CREATE INDEX > INSERT 0 3 > BEGIN > SAVEPOINT > SET > p > ------- > (0,0) > (1,1) > (2 rows) > > ROLLBACK > SET > SET > p > ----------- > (0,0) > (1,1) > (NaN,NaN) > (3 rows) > > > At first glance, you'd think this is the gist AM's bad, but on a second > thought, something else is strange here. The following query returns > true: > > SELECT point '(NaN, NaN)' <@ polygon '(0,0), (0, 100), (100, 100), (100, 0)' > > The above behavior of the "contained in" operator is surprising, and > it's probably not what the GiST AM is expecting. I took a look at > point_inside() in geo_ops.c, and it doesn't seem well equipped to handle > NaN. Similary ill-equipped is dist_ppoly_internal() which underlies the > distnace operator for polygon. It gives the following interesting > output: > > SELECT *, c <-> polygon '(0,0),(0,100),(100,100),(100,0)' as distance > FROM ( > SELECT circle(point(100 * i, 'NaN'), 50) AS c > FROM generate_series(-2, 4) i > ) t(c) > ORDER BY 2; > > c | distance > -----------------+---------- > <(-200,NaN),50> | 0 > <(-100,NaN),50> | 0 > <(0,NaN),50> | 0 > <(100,NaN),50> | 0 > <(200,NaN),50> | NaN > <(300,NaN),50> | NaN > <(400,NaN),50> | NaN > (7 rows) > > Should they all be NaN? Am I alone in thinking the index is right but > the operators are wrong? Or should we call the indexes wrong here? > > Cheers, > Jesse and Chris > > -- Bruce Momjian <br...@momjian.us> https://momjian.us EnterpriseDB https://enterprisedb.com The usefulness of a cup is in its emptiness, Bruce Lee