Re: Bug in spg_range_quad_inner_consistent for adjacent operator (was Re: [HACKERS] Add a filed to PageHeaderData)

2014-07-16 Thread Heikki Linnakangas

On 07/16/2014 08:30 AM, Michael Paquier wrote:

On Wed, Jul 16, 2014 at 1:34 PM, Pavan Deolasee
pavan.deola...@gmail.com wrote:

Heikki, did you get chance to commit your patch? IMHO we should get the bug
fix in before minor releases next week. My apologies if you've already
committed it and I've missed the commit message.

FWIW, this patch has not been committed yet. I am not seeing any
recent update on src/backend/utils/adt/rangetypes_spgist.c.


Thanks for the reminder, committed now.

- Heikki



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: Bug in spg_range_quad_inner_consistent for adjacent operator (was Re: [HACKERS] Add a filed to PageHeaderData)

2014-07-15 Thread Pavan Deolasee
On Wed, Jul 2, 2014 at 11:11 AM, Pavan Deolasee pavan.deola...@gmail.com
wrote:

 On Wed, Jun 25, 2014 at 10:39 PM, Heikki Linnakangas 
 hlinnakan...@vmware.com wrote:


 I came up with the attached. There were several bugs:


 I tested for the original bug report and patch definitely fixes that. I
 don't feel qualified enough with SP-Gist to really comment on the other
 bugs you reported and presumably fixed in the patch.


Heikki, did you get chance to commit your patch? IMHO we should get the bug
fix in before minor releases next week. My apologies if you've already
committed it and I've missed the commit message.

Thanks,
Pavan

-- 
Pavan Deolasee
http://www.linkedin.com/in/pavandeolasee


Re: Bug in spg_range_quad_inner_consistent for adjacent operator (was Re: [HACKERS] Add a filed to PageHeaderData)

2014-07-15 Thread Michael Paquier
On Wed, Jul 16, 2014 at 1:34 PM, Pavan Deolasee
pavan.deola...@gmail.com wrote:
 Heikki, did you get chance to commit your patch? IMHO we should get the bug
 fix in before minor releases next week. My apologies if you've already
 committed it and I've missed the commit message.
FWIW, this patch has not been committed yet. I am not seeing any
recent update on src/backend/utils/adt/rangetypes_spgist.c.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: Bug in spg_range_quad_inner_consistent for adjacent operator (was Re: [HACKERS] Add a filed to PageHeaderData)

2014-07-01 Thread Pavan Deolasee
On Wed, Jun 25, 2014 at 10:39 PM, Heikki Linnakangas 
hlinnakan...@vmware.com wrote:


 I came up with the attached. There were several bugs:


I tested for the original bug report and patch definitely fixes that. I
don't feel qualified enough with SP-Gist to really comment on the other
bugs you reported and presumably fixed in the patch.

Thanks,
Pavan


Re: Bug in spg_range_quad_inner_consistent for adjacent operator (was Re: [HACKERS] Add a filed to PageHeaderData)

2014-06-25 Thread Heikki Linnakangas

On 06/24/2014 11:22 PM, Heikki Linnakangas wrote:

The real bug is in spg_range_quad_inner_consistent(), for the adjacent
operator. Things go wrong when:

The scan key is [100, 500)
The prev centroid is [500, 510)
The current centroid is [544, 554).

The row that should match but isn't returned, [500, 510) is equal to the
previous centroid. It's in quadrant 3 from the current centroid, but
spg_range_quad_inner_consistent() incorrectly concludes that it doesn't
need to scan that quadrant.

The function compares the scan key's upper bound with the the previous
centroid's lower bound and the current centroid's lower bound:


/*
  * Check if upper bound of argument is not in a
  * quadrant we visited in the previous step.
  */
cmp1 = range_cmp_bounds(typcache, upper, prevLower);
cmp2 = range_cmp_bounds(typcache, centroidLower,
prevLower);
if ((cmp2  0  cmp1  0) || (cmp2  0  cmp1  0))
 which2 = 0;


The idea is that if the scan key's upper bound doesn't fall between the
prev and current centroid's lower bounds, there is no match.

*   **
   PL   XCL

X = scan key's upper bound: 500)
PL = prev centroid's lower bound [500
CL = current centroid's lower bound [500

This is wrong. X  PL, but it's still nevertheless adjacent to it.

I'll take a closer look tomorrow...

(The if (which2) ... block after the code I quoted above also looks
wrong - it seems to be comparing the argument's lower bound when it
should be comparing the upper bound according to the comment. )


I came up with the attached. There were several bugs:

* The if (which2) { ... }  block was broken. It compared the 
argument's lower bound against centroid's upper bound, while it was 
supposed to compare the argument's upper bound against the centroid's 
lower bound (the comment was right). Also, it clear bits in the which1 
variable, while it was supposed to clear bits in which2. ISTM it was 
copy-pasted from the if (which1) { ... } block just before it, but not 
modified.


* If the argument's upper bound was equal to the centroid's lower bound, 
we descended to both halves (= all quadrants). That's unnecessary. 
Imagine that the argument is (x, 500), and the centroid is (500, y), so 
that the bounds are equal. The adjacent ranges that we're trying to find 
would be of form [500, z), which are to the right of the centroid. There 
is no need to visit the left quadrants. This won't lead to incorrect 
query results, but slows down queries unnecessarily.


* In the case that argument's lower bound is adjacent to the centroid's 
upper bound, we also don't need to visit all quadrants. Per similar 
reasoning as previous point.


* The code where we compare the previous centroid with the current 
centroid should match the code where we compare the current centroid 
with the argument. The point of that code is to redo the calculation 
done in the previous level, to see if we were supposed to traverse left 
or right (or up or down), and if we actually did. If we moved in the 
different direction, then we know there are no matches for bound.


Those could be fixed with quite small adjustments, but I think the code 
needs some refactoring. It's complicated as it is, it's very difficult 
to understand all the cases and comparisons. Case in point: the patch 
was written by Alexander, reviewed by Jeff, and committed by me, and we 
all missed the above bugs. So, I propose the attached.


I also wrote the attached little white-box testing tool for this. It 
allows you to call spg_range_quad_inner_consistent with the adjacent 
strategy, and pass the exact argument, centroid and prev centroid ranges 
you want. It prints out the result of which quadrants to visit.


- Heikki

diff --git a/src/backend/utils/adt/rangetypes_spgist.c b/src/backend/utils/adt/rangetypes_spgist.c
index a55cffa..1b83941 100644
--- a/src/backend/utils/adt/rangetypes_spgist.c
+++ b/src/backend/utils/adt/rangetypes_spgist.c
@@ -54,6 +54,12 @@ static int16 getQuadrant(TypeCacheEntry *typcache, RangeType *centroid,
 			RangeType *tst);
 static int	bound_cmp(const void *a, const void *b, void *arg);
 
+static int adjacent_inner_consistent(TypeCacheEntry *typcache,
+		  RangeBound *arg, RangeBound *centroid,
+		  RangeBound *prev);
+static int adjacent_cmp_bounds(TypeCacheEntry *typcache, RangeBound *arg,
+	RangeBound *centroid);
+
 /*
  * SP-GiST 'config' interface function.
  */
@@ -441,6 +447,11 @@ spg_range_quad_inner_consistent(PG_FUNCTION_ARGS)
 			bool		empty;
 			RangeType  *range = NULL;
 
+			RangeType  *prevCentroid = NULL;
+			RangeBound	prevLower,
+		prevUpper;
+			bool		prevEmpty;
+
 			/* Restrictions on range bounds according to scan strategy */
 			RangeBound *minLower = NULL,
 	   *maxLower = NULL,
@@ -550,109 +561,53 @@ spg_range_quad_inner_consistent(PG_FUNCTION_ARGS)
 		break;	/* Skip to strictEmpty check. */
 
 	/*
-	 * which1 is bitmask for possibility to be adjacent with
-	 * lower bound of argument. 

Bug in spg_range_quad_inner_consistent for adjacent operator (was Re: [HACKERS] Add a filed to PageHeaderData)

2014-06-24 Thread Heikki Linnakangas

On 06/24/2014 08:48 PM, Pavan Deolasee wrote:

FWIW I can reproduce this on HEAD with the attached patch. I could
reproduce this on a 64-bit Ubuntu as well as 64-bit Mac OSX. Very confusing
it is because I tried with various values for N in char[N] array and it
fails for N=20. Other values I tried are 4, 12, 22, 24 and the test passes
for all of them. The logic for trying other values is to see if pd_linp[]
starting on un-aligned boundary can trigger the issue. But there seem to be
no correlation.

postgres=# select version();

PostgreSQL 9.5devel on x86_64-apple-darwin13.2.0, compiled by Apple LLVM
version 5.1 (clang-503.0.38) (based on LLVM 3.4svn), 64-bit

postgres=# -- test SP-GiST index that's been built incrementally

postgres=# create table test_range_spgist(ir int4range);
postgres=# create index test_range_spgist_idx on test_range_spgist using
spgist (ir);
postgres=# insert into test_range_spgist select int4range(g, g+10) from
generate_series(1,586) g;
INSERT 0 586

postgres=# SET enable_seqscan= t;
postgres=# SET enable_indexscan  = f;
postgres=# SET enable_bitmapscan = f;

postgres=# select * from test_range_spgist where ir -|- int4range(100,500);
 ir
---
[90,100)
[500,510)
(2 rows)

postgres=# SET enable_seqscan= f;
postgres=# select * from test_range_spgist where ir -|- int4range(100,500);
 ir
---
  [90,100)
  [500,510)
(2 rows)

At this point, both rows are visible via index scan as well as seq scan.

postgres=# insert into test_range_spgist select int4range(g, g+10) from
generate_series(587,587) g;
INSERT 0 1

postgres=# select * from test_range_spgist where ir -|- int4range(100,500);
 ir
--
  [90,100)
(1 row)

Ouch. The second row somehow disappeared.

postgres=# SET enable_seqscan= t;
postgres=# select * from test_range_spgist where ir -|- int4range(100,500);
 ir
---
  [90,100)
  [500,510)
(2 rows)

So the last INSERT suddenly makes one row disappear via the index scan
though its still reachable via seq scan. I tried looking at the SP-Gist
code but clearly I don't understand it a whole lot to figure out the issue,
if one exists.


Yeah, I can reproduce this. It doesn't seem to be related to the padding 
or alignment at all. The padding just happens to move tuples around so 
that [500, 510) is picked as an SP-GiST inner node.


The real bug is in spg_range_quad_inner_consistent(), for the adjacent 
operator. Things go wrong when:


The scan key is [100, 500)
The prev centroid is [500, 510)
The current centroid is [544, 554).

The row that should match but isn't returned, [500, 510) is equal to the 
previous centroid. It's in quadrant 3 from the current centroid, but 
spg_range_quad_inner_consistent() incorrectly concludes that it doesn't 
need to scan that quadrant.


The function compares the scan key's upper bound with the the previous 
centroid's lower bound and the current centroid's lower bound:



/*
 * Check if upper bound of argument is not in a
 * quadrant we visited in the previous step.
 */
cmp1 = range_cmp_bounds(typcache, upper, prevLower);
cmp2 = range_cmp_bounds(typcache, centroidLower,
prevLower);
if ((cmp2  0  cmp1  0) || (cmp2  0  cmp1  0))
which2 = 0;


The idea is that if the scan key's upper bound doesn't fall between the 
prev and current centroid's lower bounds, there is no match.


  *   **
 PL   XCL

X = scan key's upper bound: 500)
PL = prev centroid's lower bound [500
CL = current centroid's lower bound [500

This is wrong. X  PL, but it's still nevertheless adjacent to it.

I'll take a closer look tomorrow...

(The if (which2) ... block after the code I quoted above also looks 
wrong - it seems to be comparing the argument's lower bound when it 
should be comparing the upper bound according to the comment. )


- Heikki



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers