Re: [HACKERS] issues with range types, btree_gist and constraints

2013-02-07 Thread Tomas Vondra
On 7.2.2013 01:10, Tom Lane wrote:
 The attached patch fixes these things, but not the buggy penalty
 function, because we ought to try to verify that these changes are
 enough to prevent creation of an incorrect index.  I can't reproduce any
 misbehavior anymore with this patch applied.  However, I'm troubled by

I can't reproduce any of the issues anymore - I've tested both 9.2 and
9.3 branches (both containing the already commited fixes). And not only
that the issues seem to be gone, but I'm actually getting significantly
better performance. For example this is the old code:

test=# copy test(id) from '/home/tomas/sample-1.csv';
COPY 20001
Time: 17802,500 ms
test=# analyze;
ANALYZE
Time: 122,192 ms
test=# copy test(id) from '/home/tomas/sample-2.csv';
COPY 18590
Time: 81253,104 ms
test=# copy test(id) from '/home/tomas/sample-1.csv';
COPY 7
Time: 11008,737 ms
test=# copy test(id) from '/home/tomas/sample-2.csv';
COPY 1
Time: 21710,315 ms
test=#

and this is the new code:

test=# copy test(id) from '/home/tomas/sample-1.csv';
COPY 20001
Time: 2196,802 ms
test=# analyze;
ANALYZE
Time: 113,598 ms
test=# copy test(id) from '/home/tomas/sample-2.csv';
COPY 18590
Time: 2369,028 ms
test=# copy test(id) from '/home/tomas/sample-1.csv';
COPY 0
Time: 832,910 ms
test=# copy test(id) from '/home/tomas/sample-2.csv';
COPY 0
Time: 778,241 ms

That's more than 10x faster in all cases (and about 30x faster in some
of them).

I've also tested both branches (9.2 and 9.3) with the provided patch,
and I can't reproduce any of the errors, but the performance seems to be
equal to the old code. So it seems that the performance boost is due to
the changes to the penalty function. Nice!

 your report that the behavior is unstable, because I get the same result
 each time if I start from an empty (truncated) table, with or without
 the patch.  You may be seeing some further bug(s).  Could you test this
 fix against your own test cases?

This is what I meant by unstability:

test=# truncate test;
TRUNCATE TABLE
test=# copy test(id) from '/home/tomas/sample-1.csv';
COPY 20001
test=# copy test(id) from '/home/tomas/sample-2.csv';
COPY 18590
test=# copy test(id) from '/home/tomas/sample-1.csv';
COPY 1
test=# copy test(id) from '/home/tomas/sample-2.csv';
COPY 0
test=# copy test(id) from '/home/tomas/sample-1.csv';
COPY 0
test=# truncate test;
TRUNCATE TABLE
test=# copy test(id) from '/home/tomas/sample-1.csv';
COPY 20001
test=# copy test(id) from '/home/tomas/sample-2.csv';
COPY 18590
test=# copy test(id) from '/home/tomas/sample-1.csv';
COPY 4
test=# copy test(id) from '/home/tomas/sample-2.csv';
COPY 0
test=# truncate test;
TRUNCATE TABLE
test=# copy test(id) from '/home/tomas/sample-1.csv';
COPY 20001
test=# copy test(id) from '/home/tomas/sample-2.csv';
COPY 18590
test=# copy test(id) from '/home/tomas/sample-1.csv';
COPY 3
test=# copy test(id) from '/home/tomas/sample-2.csv';
COPY 0

Notice the numbers change all the time. But I can't really reproduce
these with the current HEAD (nor with the patch), so I assume these were
caused by switching plans at different times.

Seems fixed to me.


Tomas


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] issues with range types, btree_gist and constraints

2013-02-07 Thread Tom Lane
Tomas Vondra t...@fuzzy.cz writes:
 I can't reproduce any of the issues anymore - I've tested both 9.2 and
 9.3 branches (both containing the already commited fixes). And not only
 that the issues seem to be gone, but I'm actually getting significantly
 better performance.

Cool.  I noticed that it seemed faster too, but hadn't tried to quantify
that.

 I've also tested both branches (9.2 and 9.3) with the provided patch,
 and I can't reproduce any of the errors, but the performance seems to be
 equal to the old code. So it seems that the performance boost is due to
 the changes to the penalty function. Nice!

Yeah, the old penalty function was pretty badly broken with any non-C
sort order.

 your report that the behavior is unstable, because I get the same result
 each time if I start from an empty (truncated) table, with or without
 the patch.  You may be seeing some further bug(s).  Could you test this
 fix against your own test cases?

 This is what I meant by unstability:
 
 Notice the numbers change all the time.

[ scratches head... ]  In HEAD, that's not so surprising because of
commit ba1cc6501, which added some randomness to choices about which
GiST page to insert new entries to (which could then affect whether the
union-calculation bugs got exercised).  If you saw that in older
branches, though, it might merit some further investigation.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] issues with range types, btree_gist and constraints

2013-02-06 Thread Tom Lane
I wrote:
 Tomas Vondra t...@fuzzy.cz writes:
 I've managed to further simplify the test-case, and I've verified that
 it's reproducible on current 9.2 and 9.3 branches.

 It seems that

 (1) gistfindgroup decides that SimpleTestString is equiv to something.
 It's not too clear what; for sure there is no other leaf key of the same
 value.  The code in gistfindgroup looks a bit like it ought to think
 that all the items are equiv to one of the union datums or the other,
 but that's not the case --- no other item gets marked.

After looking closer at this, I see that it's marking left-side tuples
as equiv if they could be put in the right-side page for zero penalty,
and similarly marking right-side tuples as equiv if they could be put
in the left-side page for zero penalty.  IOW it's finding the tuples for
which it doesn't matter which side they go to.  So this is not quite as
insane as I thought, although the name equiv for the flag does not
seem like le mot juste, and the function name and comment are rather
misleading IMO.

 (After further
 investigation it seems that gbt_var_penalty is returning a negative
 penalty at the critical spot here, which might have something to
 do with it --- the business with the tmp[4] array seems many degrees
 away from trustworthy.)

This is a bug, but the core gist code ought not generate invalid indexes
just because the type-specific penalty function is doing something
stupid.

 (2) cleanupOffsets removes the item for SimpleTestString from
 sv-spl_right[], evidently because it's equiv.

 (3) placeOne() sticks the entry into the spl_left[] array, which is
 an absolutely horrid decision: it means the left-side page will now
 need a range spanning the right-side page's range, when a moment
 ago they had been disjoint.

The above behaviors are fine, really, because they are driven by the
penalty function's claim that there's zero/negative cost to moving this
tuple to the other side.

 (4) gistunionsubkey(), which apparently is supposed to fix the union
 datums to reflect this rearrangement, does no such thing because it
 only processes the index columns to the right of the one where the
 damage has been done.  (It's not clear to me that it could ever
 be allowed to skip any columns, but that's what it's doing.)

This is a bug.  It's also a bug that gistunionsubkey() fails to null out
the existing union keys for each side before calling gistMakeUnionItVec
--- that can result in the recomputed union keys being larger than
necessary.

Furthermore, there's also a bug in gbt_var_bin_union(), the same one
I saw previously that it does the wrong thing when both sides of the
existing range need to be enlarged.

The attached patch fixes these things, but not the buggy penalty
function, because we ought to try to verify that these changes are
enough to prevent creation of an incorrect index.  I can't reproduce any
misbehavior anymore with this patch applied.  However, I'm troubled by
your report that the behavior is unstable, because I get the same result
each time if I start from an empty (truncated) table, with or without
the patch.  You may be seeing some further bug(s).  Could you test this
fix against your own test cases?

There's a lot of other things I don't much like in gistsplit.c, but
they're in the nature of cosmetic and documentation fixes.

regards, tom lane

diff --git a/contrib/btree_gist/btree_utils_var.c b/contrib/btree_gist/btree_utils_var.c
index 9f8a132..691b103 100644
*** a/contrib/btree_gist/btree_utils_var.c
--- b/contrib/btree_gist/btree_utils_var.c
*** void
*** 225,237 
  gbt_var_bin_union(Datum *u, GBT_VARKEY *e, Oid collation,
    const gbtree_vinfo *tinfo)
  {
- 	GBT_VARKEY *nk = NULL;
- 	GBT_VARKEY *tmp = NULL;
- 	GBT_VARKEY_R nr;
  	GBT_VARKEY_R eo = gbt_var_key_readable(e);
  
  	if (eo.lower == eo.upper)	/* leaf */
  	{
  		tmp = gbt_var_leaf2node(e, tinfo);
  		if (tmp != e)
  			eo = gbt_var_key_readable(tmp);
--- 225,237 
  gbt_var_bin_union(Datum *u, GBT_VARKEY *e, Oid collation,
    const gbtree_vinfo *tinfo)
  {
  	GBT_VARKEY_R eo = gbt_var_key_readable(e);
+ 	GBT_VARKEY_R nr;
  
  	if (eo.lower == eo.upper)	/* leaf */
  	{
+ 		GBT_VARKEY *tmp;
+ 
  		tmp = gbt_var_leaf2node(e, tinfo);
  		if (tmp != e)
  			eo = gbt_var_key_readable(tmp);
*** gbt_var_bin_union(Datum *u, GBT_VARKEY *
*** 239,263 
  
  	if (DatumGetPointer(*u))
  	{
- 
  		GBT_VARKEY_R ro = gbt_var_key_readable((GBT_VARKEY *) DatumGetPointer(*u));
  
  		if ((*tinfo-f_cmp) (ro.lower, eo.lower, collation)  0)
  		{
  			nr.lower = eo.lower;
! 			nr.upper = ro.upper;
! 			nk = gbt_var_key_copy(nr, TRUE);
  		}
  
  		if ((*tinfo-f_cmp) (ro.upper, eo.upper, collation)  0)
  		{
  			nr.upper = eo.upper;
! 			nr.lower = ro.lower;
! 			nk = gbt_var_key_copy(nr, TRUE);
  		}
  
! 		if (nk)
! 			*u = PointerGetDatum(nk);
  	}
  	else
  	{
--- 239,264 
  
  	if (DatumGetPointer(*u))
  	{
  		GBT_VARKEY_R ro = 

Re: [HACKERS] issues with range types, btree_gist and constraints

2013-02-05 Thread Tom Lane
Tomas Vondra t...@fuzzy.cz writes:
 I've managed to further simplify the test-case, and I've verified that
 it's reproducible on current 9.2 and 9.3 branches.

I'm not sure why you're getting unstable behavior --- it's pretty stable
for me, at least in a debug build.  What I'm finding is that gistsplit.c
appears to contain multiple bugs.  If you just insert the first 271 rows
of sample-1.csv, the 271'st row results in a gist page split.  The split
results in two pages having these root-page downlink entries:

 Item   2 -- Length:   88  Offset: 8000 (0x1f40)  Flags: NORMAL
  Block Id: 2  linp Index: 65535  Size: 88
  Has Nulls: 0  Has Varwidths: 1

  1f40: 0200 5840 9390 00373637  ..X@.767
  1f50: 61626536 31313961 30313834 34366263  abe6119a018446bc
  1f60: 64373632 3738 61346632 6490  d7627ff8a4f2d...
  1f70: 00626465 61613234 38323966 31626365  .bdeaa24829f1bce
  1f80: 36343561 39313463 38386665 61613739  645a914c88feaa79
  1f90: 340d440f 18004.D.

(ie, range 767abe6119a018446bcd7627ff8a4f2d..bdeaa24829f1bce645a914c88feaa794)

 Item   3 -- Length:   72  Offset: 7928 (0x1ef8)  Flags: NORMAL
  Block Id: 3  linp Index: 65535  Size: 72
  Has Nulls: 0  Has Varwidths: 1

  1ef8: 0300 4840 7390 00626538  ..H@sbe8
  1f08: 30333261 33656138 31326565 62323838  032a3ea812eeb288
  1f18: 64663361 65636161 30666361 3450  df3aecaa0fca4P..
  1f28: 0053696d 706c6554 65737453 7472696e  .SimpleTestStrin
  1f38: 670d440f 1800g.D.

(ie, range be8032a3ea812eeb288df3aecaa0fca4..SimpleTestString)

which looks fine .. but the leaf entry for SimpleTestString went into
the first of these two pages, which is 100% wrong.

The split produced by contrib/btree_gist is perfectly fine, and it
assigns SimpleTestString to the right-hand page as expected.  The
bug is being introduced in lines 435..480 of gistsplit.c, which
is code that doesn't fire unless we're working on a non-last
column of the index (which is why you couldn't reproduce this in
a 1-column index).  It seems that

(1) gistfindgroup decides that SimpleTestString is equiv to something.
It's not too clear what; for sure there is no other leaf key of the same
value.  The code in gistfindgroup looks a bit like it ought to think
that all the items are equiv to one of the union datums or the other,
but that's not the case --- no other item gets marked.  (After further
investigation it seems that gbt_var_penalty is returning a negative
penalty at the critical spot here, which might have something to
do with it --- the business with the tmp[4] array seems many degrees
away from trustworthy.)

(2) cleanupOffsets removes the item for SimpleTestString from
sv-spl_right[], evidently because it's equiv.

(3) placeOne() sticks the entry into the spl_left[] array, which is
an absolutely horrid decision: it means the left-side page will now
need a range spanning the right-side page's range, when a moment
ago they had been disjoint.

(4) gistunionsubkey(), which apparently is supposed to fix the union
datums to reflect this rearrangement, does no such thing because it
only processes the index columns to the right of the one where the
damage has been done.  (It's not clear to me that it could ever
be allowed to skip any columns, but that's what it's doing.)

If I don't hear a commitment from Teodor to fix this, I'm going to
rip out pretty much all of this logic as being under-documented
and overly-buggy.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] issues with range types, btree_gist and constraints

2013-02-04 Thread Tomas Vondra
Hi,

I've managed to further simplify the test-case, and I've verified that
it's reproducible on current 9.2 and 9.3 branches.

This is the necessary table structure:

---
CREATE TABLE test (
idTEXT,
valid TSRANGE NOT NULL DEFAULT tsrange(NULL, NULL),
CONSTRAINT unique_ids EXCLUDE USING GIST (id WITH =, valid WITH )
);

CREATE OR REPLACE FUNCTION skip_existing() RETURNS trigger AS $$
DECLARE
v_exists BOOLEAN;
BEGIN

SELECT TRUE INTO v_exists FROM test WHERE id = NEW.id;
IF v_exists THEN
RETURN NULL;
END IF;

RETURN NEW;

END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER skip_existing BEFORE INSERT ON test FOR EACH ROW EXECUTE
PROCEDURE skip_existing();
---

I've been unable to reproduce the bug with just a single text column.

The trigger simply skips existing records without throwing any error.

Now let's insert some data into the table - I'll use the same samples as
in the previous test-case (http://www.fuzzy.cz/tmp/samples.tgz).


test=# copy test(id) from '/tmp/sample-1.csv';
COPY 20001
test=# copy test(id) from '/tmp/sample-2.csv';
COPY 18590
test=# copy test(id) from '/tmp/sample-1.csv';
COPY 25
test=# copy test(id) from '/tmp/sample-2.csv';
COPY 45

The last two results are really suspicious - it means that some record
were inserted again, so let's verify that:

test=# select id, count(*) from test group by id having count(*)  1;
id| count
--+---
 0aab4791e1e41f62fd8452ae2c854a34 | 2
 0aa08441cd4526b972bb3451d9f8e4ea | 2
 0ab969a342837484ec0f81bf1e03 | 2
 0aea0c33c76b1fe5d123b18cec184dc0 | 2
 0af75a99b37be6dde08afaa69de36d29 | 2
 0af80d2c2931756b897b3ca5a0055820 | 2
 ... many more ...

On 9.3 the number of duplicates is much lower, and it's not stable - on
one run I get 6, on the very next one I get 2, then 5 and so on. That
leads me to a suspicion that it might be an uninitialized variable or
something like that.

Tomas


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] issues with range types, btree_gist and constraints

2013-02-04 Thread Tom Lane
Tomas Vondra t...@fuzzy.cz writes:
 I've managed to further simplify the test-case, and I've verified that
 it's reproducible on current 9.2 and 9.3 branches.

Yeah, I've been able to reproduce it as well.  I found what I thought
might be the bug: gbt_var_bin_union() looks like it does the wrong thing
if both ends of the range need to be extended.  However, fixing that
doesn't fix your test case, so there's more to it.

 I've been unable to reproduce the bug with just a single text column.

Hm.  I noticed a lot of elog complaints about picksplit method
... doesn't support secondary split, so maybe the bug is in the code
that's trying to deal with that.  I've not had time to look closer.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers