Re: [HACKERS] issues with range types, btree_gist and constraints
On 7.2.2013 01:10, Tom Lane wrote: The attached patch fixes these things, but not the buggy penalty function, because we ought to try to verify that these changes are enough to prevent creation of an incorrect index. I can't reproduce any misbehavior anymore with this patch applied. However, I'm troubled by I can't reproduce any of the issues anymore - I've tested both 9.2 and 9.3 branches (both containing the already commited fixes). And not only that the issues seem to be gone, but I'm actually getting significantly better performance. For example this is the old code: test=# copy test(id) from '/home/tomas/sample-1.csv'; COPY 20001 Time: 17802,500 ms test=# analyze; ANALYZE Time: 122,192 ms test=# copy test(id) from '/home/tomas/sample-2.csv'; COPY 18590 Time: 81253,104 ms test=# copy test(id) from '/home/tomas/sample-1.csv'; COPY 7 Time: 11008,737 ms test=# copy test(id) from '/home/tomas/sample-2.csv'; COPY 1 Time: 21710,315 ms test=# and this is the new code: test=# copy test(id) from '/home/tomas/sample-1.csv'; COPY 20001 Time: 2196,802 ms test=# analyze; ANALYZE Time: 113,598 ms test=# copy test(id) from '/home/tomas/sample-2.csv'; COPY 18590 Time: 2369,028 ms test=# copy test(id) from '/home/tomas/sample-1.csv'; COPY 0 Time: 832,910 ms test=# copy test(id) from '/home/tomas/sample-2.csv'; COPY 0 Time: 778,241 ms That's more than 10x faster in all cases (and about 30x faster in some of them). I've also tested both branches (9.2 and 9.3) with the provided patch, and I can't reproduce any of the errors, but the performance seems to be equal to the old code. So it seems that the performance boost is due to the changes to the penalty function. Nice! your report that the behavior is unstable, because I get the same result each time if I start from an empty (truncated) table, with or without the patch. You may be seeing some further bug(s). Could you test this fix against your own test cases? This is what I meant by unstability: test=# truncate test; TRUNCATE TABLE test=# copy test(id) from '/home/tomas/sample-1.csv'; COPY 20001 test=# copy test(id) from '/home/tomas/sample-2.csv'; COPY 18590 test=# copy test(id) from '/home/tomas/sample-1.csv'; COPY 1 test=# copy test(id) from '/home/tomas/sample-2.csv'; COPY 0 test=# copy test(id) from '/home/tomas/sample-1.csv'; COPY 0 test=# truncate test; TRUNCATE TABLE test=# copy test(id) from '/home/tomas/sample-1.csv'; COPY 20001 test=# copy test(id) from '/home/tomas/sample-2.csv'; COPY 18590 test=# copy test(id) from '/home/tomas/sample-1.csv'; COPY 4 test=# copy test(id) from '/home/tomas/sample-2.csv'; COPY 0 test=# truncate test; TRUNCATE TABLE test=# copy test(id) from '/home/tomas/sample-1.csv'; COPY 20001 test=# copy test(id) from '/home/tomas/sample-2.csv'; COPY 18590 test=# copy test(id) from '/home/tomas/sample-1.csv'; COPY 3 test=# copy test(id) from '/home/tomas/sample-2.csv'; COPY 0 Notice the numbers change all the time. But I can't really reproduce these with the current HEAD (nor with the patch), so I assume these were caused by switching plans at different times. Seems fixed to me. Tomas -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] issues with range types, btree_gist and constraints
Tomas Vondra t...@fuzzy.cz writes: I can't reproduce any of the issues anymore - I've tested both 9.2 and 9.3 branches (both containing the already commited fixes). And not only that the issues seem to be gone, but I'm actually getting significantly better performance. Cool. I noticed that it seemed faster too, but hadn't tried to quantify that. I've also tested both branches (9.2 and 9.3) with the provided patch, and I can't reproduce any of the errors, but the performance seems to be equal to the old code. So it seems that the performance boost is due to the changes to the penalty function. Nice! Yeah, the old penalty function was pretty badly broken with any non-C sort order. your report that the behavior is unstable, because I get the same result each time if I start from an empty (truncated) table, with or without the patch. You may be seeing some further bug(s). Could you test this fix against your own test cases? This is what I meant by unstability: Notice the numbers change all the time. [ scratches head... ] In HEAD, that's not so surprising because of commit ba1cc6501, which added some randomness to choices about which GiST page to insert new entries to (which could then affect whether the union-calculation bugs got exercised). If you saw that in older branches, though, it might merit some further investigation. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] issues with range types, btree_gist and constraints
I wrote: Tomas Vondra t...@fuzzy.cz writes: I've managed to further simplify the test-case, and I've verified that it's reproducible on current 9.2 and 9.3 branches. It seems that (1) gistfindgroup decides that SimpleTestString is equiv to something. It's not too clear what; for sure there is no other leaf key of the same value. The code in gistfindgroup looks a bit like it ought to think that all the items are equiv to one of the union datums or the other, but that's not the case --- no other item gets marked. After looking closer at this, I see that it's marking left-side tuples as equiv if they could be put in the right-side page for zero penalty, and similarly marking right-side tuples as equiv if they could be put in the left-side page for zero penalty. IOW it's finding the tuples for which it doesn't matter which side they go to. So this is not quite as insane as I thought, although the name equiv for the flag does not seem like le mot juste, and the function name and comment are rather misleading IMO. (After further investigation it seems that gbt_var_penalty is returning a negative penalty at the critical spot here, which might have something to do with it --- the business with the tmp[4] array seems many degrees away from trustworthy.) This is a bug, but the core gist code ought not generate invalid indexes just because the type-specific penalty function is doing something stupid. (2) cleanupOffsets removes the item for SimpleTestString from sv-spl_right[], evidently because it's equiv. (3) placeOne() sticks the entry into the spl_left[] array, which is an absolutely horrid decision: it means the left-side page will now need a range spanning the right-side page's range, when a moment ago they had been disjoint. The above behaviors are fine, really, because they are driven by the penalty function's claim that there's zero/negative cost to moving this tuple to the other side. (4) gistunionsubkey(), which apparently is supposed to fix the union datums to reflect this rearrangement, does no such thing because it only processes the index columns to the right of the one where the damage has been done. (It's not clear to me that it could ever be allowed to skip any columns, but that's what it's doing.) This is a bug. It's also a bug that gistunionsubkey() fails to null out the existing union keys for each side before calling gistMakeUnionItVec --- that can result in the recomputed union keys being larger than necessary. Furthermore, there's also a bug in gbt_var_bin_union(), the same one I saw previously that it does the wrong thing when both sides of the existing range need to be enlarged. The attached patch fixes these things, but not the buggy penalty function, because we ought to try to verify that these changes are enough to prevent creation of an incorrect index. I can't reproduce any misbehavior anymore with this patch applied. However, I'm troubled by your report that the behavior is unstable, because I get the same result each time if I start from an empty (truncated) table, with or without the patch. You may be seeing some further bug(s). Could you test this fix against your own test cases? There's a lot of other things I don't much like in gistsplit.c, but they're in the nature of cosmetic and documentation fixes. regards, tom lane diff --git a/contrib/btree_gist/btree_utils_var.c b/contrib/btree_gist/btree_utils_var.c index 9f8a132..691b103 100644 *** a/contrib/btree_gist/btree_utils_var.c --- b/contrib/btree_gist/btree_utils_var.c *** void *** 225,237 gbt_var_bin_union(Datum *u, GBT_VARKEY *e, Oid collation, const gbtree_vinfo *tinfo) { - GBT_VARKEY *nk = NULL; - GBT_VARKEY *tmp = NULL; - GBT_VARKEY_R nr; GBT_VARKEY_R eo = gbt_var_key_readable(e); if (eo.lower == eo.upper) /* leaf */ { tmp = gbt_var_leaf2node(e, tinfo); if (tmp != e) eo = gbt_var_key_readable(tmp); --- 225,237 gbt_var_bin_union(Datum *u, GBT_VARKEY *e, Oid collation, const gbtree_vinfo *tinfo) { GBT_VARKEY_R eo = gbt_var_key_readable(e); + GBT_VARKEY_R nr; if (eo.lower == eo.upper) /* leaf */ { + GBT_VARKEY *tmp; + tmp = gbt_var_leaf2node(e, tinfo); if (tmp != e) eo = gbt_var_key_readable(tmp); *** gbt_var_bin_union(Datum *u, GBT_VARKEY * *** 239,263 if (DatumGetPointer(*u)) { - GBT_VARKEY_R ro = gbt_var_key_readable((GBT_VARKEY *) DatumGetPointer(*u)); if ((*tinfo-f_cmp) (ro.lower, eo.lower, collation) 0) { nr.lower = eo.lower; ! nr.upper = ro.upper; ! nk = gbt_var_key_copy(nr, TRUE); } if ((*tinfo-f_cmp) (ro.upper, eo.upper, collation) 0) { nr.upper = eo.upper; ! nr.lower = ro.lower; ! nk = gbt_var_key_copy(nr, TRUE); } ! if (nk) ! *u = PointerGetDatum(nk); } else { --- 239,264 if (DatumGetPointer(*u)) { GBT_VARKEY_R ro =
Re: [HACKERS] issues with range types, btree_gist and constraints
Tomas Vondra t...@fuzzy.cz writes: I've managed to further simplify the test-case, and I've verified that it's reproducible on current 9.2 and 9.3 branches. I'm not sure why you're getting unstable behavior --- it's pretty stable for me, at least in a debug build. What I'm finding is that gistsplit.c appears to contain multiple bugs. If you just insert the first 271 rows of sample-1.csv, the 271'st row results in a gist page split. The split results in two pages having these root-page downlink entries: Item 2 -- Length: 88 Offset: 8000 (0x1f40) Flags: NORMAL Block Id: 2 linp Index: 65535 Size: 88 Has Nulls: 0 Has Varwidths: 1 1f40: 0200 5840 9390 00373637 ..X@.767 1f50: 61626536 31313961 30313834 34366263 abe6119a018446bc 1f60: 64373632 3738 61346632 6490 d7627ff8a4f2d... 1f70: 00626465 61613234 38323966 31626365 .bdeaa24829f1bce 1f80: 36343561 39313463 38386665 61613739 645a914c88feaa79 1f90: 340d440f 18004.D. (ie, range 767abe6119a018446bcd7627ff8a4f2d..bdeaa24829f1bce645a914c88feaa794) Item 3 -- Length: 72 Offset: 7928 (0x1ef8) Flags: NORMAL Block Id: 3 linp Index: 65535 Size: 72 Has Nulls: 0 Has Varwidths: 1 1ef8: 0300 4840 7390 00626538 ..H@sbe8 1f08: 30333261 33656138 31326565 62323838 032a3ea812eeb288 1f18: 64663361 65636161 30666361 3450 df3aecaa0fca4P.. 1f28: 0053696d 706c6554 65737453 7472696e .SimpleTestStrin 1f38: 670d440f 1800g.D. (ie, range be8032a3ea812eeb288df3aecaa0fca4..SimpleTestString) which looks fine .. but the leaf entry for SimpleTestString went into the first of these two pages, which is 100% wrong. The split produced by contrib/btree_gist is perfectly fine, and it assigns SimpleTestString to the right-hand page as expected. The bug is being introduced in lines 435..480 of gistsplit.c, which is code that doesn't fire unless we're working on a non-last column of the index (which is why you couldn't reproduce this in a 1-column index). It seems that (1) gistfindgroup decides that SimpleTestString is equiv to something. It's not too clear what; for sure there is no other leaf key of the same value. The code in gistfindgroup looks a bit like it ought to think that all the items are equiv to one of the union datums or the other, but that's not the case --- no other item gets marked. (After further investigation it seems that gbt_var_penalty is returning a negative penalty at the critical spot here, which might have something to do with it --- the business with the tmp[4] array seems many degrees away from trustworthy.) (2) cleanupOffsets removes the item for SimpleTestString from sv-spl_right[], evidently because it's equiv. (3) placeOne() sticks the entry into the spl_left[] array, which is an absolutely horrid decision: it means the left-side page will now need a range spanning the right-side page's range, when a moment ago they had been disjoint. (4) gistunionsubkey(), which apparently is supposed to fix the union datums to reflect this rearrangement, does no such thing because it only processes the index columns to the right of the one where the damage has been done. (It's not clear to me that it could ever be allowed to skip any columns, but that's what it's doing.) If I don't hear a commitment from Teodor to fix this, I'm going to rip out pretty much all of this logic as being under-documented and overly-buggy. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] issues with range types, btree_gist and constraints
Hi, I've managed to further simplify the test-case, and I've verified that it's reproducible on current 9.2 and 9.3 branches. This is the necessary table structure: --- CREATE TABLE test ( idTEXT, valid TSRANGE NOT NULL DEFAULT tsrange(NULL, NULL), CONSTRAINT unique_ids EXCLUDE USING GIST (id WITH =, valid WITH ) ); CREATE OR REPLACE FUNCTION skip_existing() RETURNS trigger AS $$ DECLARE v_exists BOOLEAN; BEGIN SELECT TRUE INTO v_exists FROM test WHERE id = NEW.id; IF v_exists THEN RETURN NULL; END IF; RETURN NEW; END; $$ LANGUAGE plpgsql; CREATE TRIGGER skip_existing BEFORE INSERT ON test FOR EACH ROW EXECUTE PROCEDURE skip_existing(); --- I've been unable to reproduce the bug with just a single text column. The trigger simply skips existing records without throwing any error. Now let's insert some data into the table - I'll use the same samples as in the previous test-case (http://www.fuzzy.cz/tmp/samples.tgz). test=# copy test(id) from '/tmp/sample-1.csv'; COPY 20001 test=# copy test(id) from '/tmp/sample-2.csv'; COPY 18590 test=# copy test(id) from '/tmp/sample-1.csv'; COPY 25 test=# copy test(id) from '/tmp/sample-2.csv'; COPY 45 The last two results are really suspicious - it means that some record were inserted again, so let's verify that: test=# select id, count(*) from test group by id having count(*) 1; id| count --+--- 0aab4791e1e41f62fd8452ae2c854a34 | 2 0aa08441cd4526b972bb3451d9f8e4ea | 2 0ab969a342837484ec0f81bf1e03 | 2 0aea0c33c76b1fe5d123b18cec184dc0 | 2 0af75a99b37be6dde08afaa69de36d29 | 2 0af80d2c2931756b897b3ca5a0055820 | 2 ... many more ... On 9.3 the number of duplicates is much lower, and it's not stable - on one run I get 6, on the very next one I get 2, then 5 and so on. That leads me to a suspicion that it might be an uninitialized variable or something like that. Tomas -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] issues with range types, btree_gist and constraints
Tomas Vondra t...@fuzzy.cz writes: I've managed to further simplify the test-case, and I've verified that it's reproducible on current 9.2 and 9.3 branches. Yeah, I've been able to reproduce it as well. I found what I thought might be the bug: gbt_var_bin_union() looks like it does the wrong thing if both ends of the range need to be extended. However, fixing that doesn't fix your test case, so there's more to it. I've been unable to reproduce the bug with just a single text column. Hm. I noticed a lot of elog complaints about picksplit method ... doesn't support secondary split, so maybe the bug is in the code that's trying to deal with that. I've not had time to look closer. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers