Re: [HACKERS] What is "index returned tuples in wrong order" for recheck supposed to guard against?

2017-02-02 Thread Kyotaro HORIGUCHI
I forgot to mention this..

At Thu, 02 Feb 2017 21:11:16 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI 
 wrote in 
<20170202.26.49572782.horiguchi.kyot...@lab.ntt.co.jp>
> Hello, I digged this topic covered by a spiderweb..
> 
> # PostGIS requires so many libraries installed :(
> 
> FIY, for the given test case, the following query hits the bug.
> 
> | SELECT edge.geom AS geom
> | FROM (SELECT * FROM knn_recheck_geom) AS edge
> | ORDER BY '01010014AE47E17A141FC09AA170C0' <-> edge.geom LIMIT 2;
> 
> 
> At Tue, 3 Jan 2017 11:18:55 -0500, Robert Haas  wrote 
> in 
> > On Tue, Jan 3, 2017 at 12:36 AM, Regina Obe  wrote:
> > > It's still not quite clear to me even looking at that git commit, why 
> > > those need to error instead of going thru recheck aside from efficiency.
> > 
> > The code that reorders the returned tuples assumes that (1) the actual
> > distance is always greater than or equal to the estimated distance and
> > (2) the index returns the tuples in order of increasing estimated
> > distance.  Imagine that the estimated distances are 0, 1, 2, 3... and
> > the real distances are 2,3,4,5...  When it sees the
> > estimated-distance-0 tuple it computes that the actual distance is 2,
> > but it doesn't know whether there's going to be a tuple later with an
> > actual distance between 0 and 2, so it buffers the tuple. When it sees
> > the estimated-distance-1 tuple it computes that the actual distance is
> > 2, and now it knows there won't be any more estimated or actual
> > distances between 0 and 1, but there could still be a tuple with an
> > estimated distance of 1 and 2 whose actual distance is also between 1
> > and 2, so it buffers the second tuple as well.  When it sees the third
> > tuple, with estimated distance 2, it now knows that there won't be any
> > further tuples whose estimated or actual distance is less than 2.  So
> > now it can emit the first tuple that it saw, because that had an
> > actual distance of 2 and from this point forward the index will never
> > return anything with a smaller estimated or actual distance.  The
> > estimated-distance-1 tuple still has to stay in the buffer, though,
> > until we see a tuple whose estimated distance is greater than that
> > tuple's actual distance (3).
> 
> The estimation is calculated using box2df_distance, and the
> recheck evaluation is using distnce (of libgeom).
> 
> For the problematic case, the distance's result is
> "6.992439" and box2df_distance's is
> "6.99330517578125" by "%20.18lf". The point should be just on
> the edge of its bounding box.

The binary representaions of the two are the following.

distance:
 "40 1b f8 d4 fd f3 b6 45"
=(0:plus) (101 = 2^(1025-1023) = *4) 1."bf8d4fdf3b645"

estimation:
 "40 1b f8 d5 00 00 00 00"
=(0:plus) (101 = 2^(1025-1023) = *4) 1."bf8d5"

I mean, the estimation (box2df_distance) seems getting precision
reduction on its matissa. (I cannot say more than this, though)


> gserialized_gist_distance_2d() of PostGIS 2.3.0:
> >  /* In all cases, since we only have keys (boxes) we'll return */
> >  /* the minimum possible distance, which is the box2df_distance */
> >  /* and let the recheck sort things out in the case of leaves */
> >  distance = (double)box2df_distance(entry_box, _box);
> 
> This theoretically doesn't give larger value than the real
> distance but it is failing.  I'm not sure how to fix this but at
> least it is not a problem of GIST or PostgreSQL.
> 
> If set the distance() of libgeom as the base, box2df_distance
> might should return a very-very slightly smaller value (I don't
> know how much it should be.) so that it can conseal the error of
> its operation.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] What is "index returned tuples in wrong order" for recheck supposed to guard against?

2017-02-02 Thread Kyotaro HORIGUCHI
Hello, I digged this topic covered by a spiderweb..

# PostGIS requires so many libraries installed :(

FIY, for the given test case, the following query hits the bug.

| SELECT edge.geom AS geom
| FROM (SELECT * FROM knn_recheck_geom) AS edge
| ORDER BY '01010014AE47E17A141FC09AA170C0' <-> edge.geom LIMIT 2;


At Tue, 3 Jan 2017 11:18:55 -0500, Robert Haas  wrote in 

> On Tue, Jan 3, 2017 at 12:36 AM, Regina Obe  wrote:
> > It's still not quite clear to me even looking at that git commit, why those 
> > need to error instead of going thru recheck aside from efficiency.
> 
> The code that reorders the returned tuples assumes that (1) the actual
> distance is always greater than or equal to the estimated distance and
> (2) the index returns the tuples in order of increasing estimated
> distance.  Imagine that the estimated distances are 0, 1, 2, 3... and
> the real distances are 2,3,4,5...  When it sees the
> estimated-distance-0 tuple it computes that the actual distance is 2,
> but it doesn't know whether there's going to be a tuple later with an
> actual distance between 0 and 2, so it buffers the tuple. When it sees
> the estimated-distance-1 tuple it computes that the actual distance is
> 2, and now it knows there won't be any more estimated or actual
> distances between 0 and 1, but there could still be a tuple with an
> estimated distance of 1 and 2 whose actual distance is also between 1
> and 2, so it buffers the second tuple as well.  When it sees the third
> tuple, with estimated distance 2, it now knows that there won't be any
> further tuples whose estimated or actual distance is less than 2.  So
> now it can emit the first tuple that it saw, because that had an
> actual distance of 2 and from this point forward the index will never
> return anything with a smaller estimated or actual distance.  The
> estimated-distance-1 tuple still has to stay in the buffer, though,
> until we see a tuple whose estimated distance is greater than that
> tuple's actual distance (3).

The estimation is calculated using box2df_distance, and the
recheck evaluation is using distnce (of libgeom).

For the problematic case, the distance's result is
"6.992439" and box2df_distance's is
"6.99330517578125" by "%20.18lf". The point should be just on
the edge of its bounding box.

gserialized_gist_distance_2d() of PostGIS 2.3.0:
>  /* In all cases, since we only have keys (boxes) we'll return */
>  /* the minimum possible distance, which is the box2df_distance */
>  /* and let the recheck sort things out in the case of leaves */
>  distance = (double)box2df_distance(entry_box, _box);

This theoretically doesn't give larger value than the real
distance but it is failing.  I'm not sure how to fix this but at
least it is not a problem of GIST or PostgreSQL.

If set the distance() of libgeom as the base, box2df_distance
might should return a very-very slightly smaller value (I don't
know how much it should be.) so that it can conseal the error of
its operation.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] What is "index returned tuples in wrong order" for recheck supposed to guard against?

2017-01-03 Thread Robert Haas
On Tue, Jan 3, 2017 at 12:36 AM, Regina Obe  wrote:
>> cmp would return 0 if the estimated distance returned by the index AM were 
>> greater than the actual distance.
>> The estimated distance can be less than the actual distance, but it isn't 
>> allowed to be more.  See gist_bbox_distance for an example of a "lossy" 
>> distance calculation, and more generally "git show 
>> 35fcb1b3d038a501f3f4c87c05630095abaaadab".
>
> Did you mean would return < 0 ?

Yes, sorry.

> Since I thought 0 meant exact and not where it's Erroring?
>
> I think for points then maybe we should turn it off, as this could just be 
> floating point issues with the way we compute the index.
> That would explain why it doesn't happen for other cases like  polygon / 
> point in our code
> or polygon /polygon in our code since the box box distance in our code would 
> always be <= actual distance for those.
>
> So maybe the best course of action is just for us inspect the geometries and 
> if both are points just disable recheck.
>
> It's still not quite clear to me even looking at that git commit, why those 
> need to error instead of going thru recheck aside from efficiency.

The code that reorders the returned tuples assumes that (1) the actual
distance is always greater than or equal to the estimated distance and
(2) the index returns the tuples in order of increasing estimated
distance.  Imagine that the estimated distances are 0, 1, 2, 3... and
the real distances are 2,3,4,5...  When it sees the
estimated-distance-0 tuple it computes that the actual distance is 2,
but it doesn't know whether there's going to be a tuple later with an
actual distance between 0 and 2, so it buffers the tuple. When it sees
the estimated-distance-1 tuple it computes that the actual distance is
2, and now it knows there won't be any more estimated or actual
distances between 0 and 1, but there could still be a tuple with an
estimated distance of 1 and 2 whose actual distance is also between 1
and 2, so it buffers the second tuple as well.  When it sees the third
tuple, with estimated distance 2, it now knows that there won't be any
further tuples whose estimated or actual distance is less than 2.  So
now it can emit the first tuple that it saw, because that had an
actual distance of 2 and from this point forward the index will never
return anything with a smaller estimated or actual distance.  The
estimated-distance-1 tuple still has to stay in the buffer, though,
until we see a tuple whose estimated distance is greater than that
tuple's actual distance (3).

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] What is "index returned tuples in wrong order" for recheck supposed to guard against?

2017-01-02 Thread Regina Obe

>> If things are out of order, why isn't just going to was_exact = false 
>> good enough?
>>
>> I'm not sure if the mistake is in our PostGIS code or something in 
>> PostgreSQL recheck logic.
>> If I change the elog(ERROR ...) to a elog(NOTICE, the answers  are 
>> correct and sort order is right.
>>
>> Under what conditions would cmp return less than 0?  I tried following 
>> the code in cmp_orderbyvals, but got lost and trying to put elog 
>> notices in to see what the distance is returning (I probably did it 
>> wrong), just ended up crashing by backend.

> cmp would return 0 if the estimated distance returned by the index AM were 
> greater than the actual distance.  
> The estimated distance can be less than the actual distance, but it isn't 
> allowed to be more.  See gist_bbox_distance for an example of a "lossy" 
> distance calculation, and more generally "git show 
> 35fcb1b3d038a501f3f4c87c05630095abaaadab".

>--
>Robert Haas
> EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company

Did you mean would return < 0 ? 

Since I thought 0 meant exact and not where it's Erroring?

I think for points then maybe we should turn it off, as this could just be 
floating point issues with the way we compute the index.
That would explain why it doesn't happen for other cases like  polygon / point 
in our code
or polygon /polygon in our code since the box box distance in our code would 
always be <= actual distance for those.

So maybe the best course of action is just for us inspect the geometries and if 
both are points just disable recheck.

It's still not quite clear to me even looking at that git commit, why those 
need to error instead of going thru recheck aside from efficiency.


Thanks,
Regina



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] What is "index returned tuples in wrong order" for recheck supposed to guard against?

2017-01-02 Thread Robert Haas
On Fri, Dec 30, 2016 at 12:51 AM, Regina Obe  wrote:
> I've been trying to troubleshoot the cause of this PostGIS recheck bug we
> have reported by two people so far.  The last test was a nice simple
> repeatable one that triggered the issue:
>
> https://trac.osgeo.org/postgis/ticket/3418
>
> from what I have seen this only affects cases where we are doing a distance
> check between two points, which we actually don't need to enable recheck for
> anyway, but trying to disable that seems like just shoving the real problem
> under the covers.

Agreed.

> If things are out of order, why isn't just going to was_exact = false good
> enough?
>
> I'm not sure if the mistake is in our PostGIS code or something in
> PostgreSQL recheck logic.
> If I change the elog(ERROR ...) to a elog(NOTICE, the answers  are correct
> and sort order is right.
>
> Under what conditions would cmp return less than 0?  I tried following the
> code in cmp_orderbyvals, but got lost
> and trying to put elog notices in to see what the distance is returning (I
> probably did it wrong), just ended up crashing by backend.

cmp would return 0 if the estimated distance returned by the index AM
were greater than the actual distance.  The estimated distance can be
less than the actual distance, but it isn't allowed to be more.  See
gist_bbox_distance for an example of a "lossy" distance calculation,
and more generally "git show
35fcb1b3d038a501f3f4c87c05630095abaaadab".

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] What is "index returned tuples in wrong order" for recheck supposed to guard against?

2016-12-29 Thread Regina Obe
I've been trying to troubleshoot the cause of this PostGIS recheck bug we
have reported by two people so far.  The last test was a nice simple
repeatable one that triggered the issue:

https://trac.osgeo.org/postgis/ticket/3418


from what I have seen this only affects cases where we are doing a distance
check between two points, which we actually don't need to enable recheck for
anyway, but trying to disable that seems like just shoving the real problem
under the covers.
Where it errors is this line 272 in src/backend/executor/nodeIndexscan

https://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/exe
cutor/nodeIndexscan.c;h=3143bd94ec4499fba94b41693538b785c4b32e6c;hb=HEAD#l27
2


  /*
 259  * Was the ORDER BY value returned by the index accurate?
The
 260  * recheck flag means that the index can return inaccurate
values,
 261  * but then again, the value returned for any particular
tuple
 262  * could also be exactly correct.  Compare the value
returned by
 263  * the index with the recalculated value.  (If the value
returned
 264  * by the index happened to be exact right, we can often
avoid
 265  * pushing the tuple to the queue, just to pop it back out
again.)
 266  */
 267 cmp = cmp_orderbyvals(node->iss_OrderByValues,
 268   node->iss_OrderByNulls,
 269   scandesc->xs_orderbyvals,
 270   scandesc->xs_orderbynulls,
 271   node);
 272 if (cmp < 0)
 273 elog(ERROR, "index returned tuples in wrong order");
 274 else if (cmp == 0)
 275 was_exact = true;
 276 else
 277 was_exact = false;

If things are out of order, why isn't just going to was_exact = false good
enough?

I'm not sure if the mistake is in our PostGIS code or something in
PostgreSQL recheck logic.
If I change the elog(ERROR ...) to a elog(NOTICE, the answers  are correct
and sort order is right.

Under what conditions would cmp return less than 0?  I tried following the
code in cmp_orderbyvals, but got lost
and trying to put elog notices in to see what the distance is returning (I
probably did it wrong), just ended up crashing by backend.


Thanks for any thoughts,
Regina





-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers