Re: [HACKERS] GiST seems to drop left-branch leaf tuples

2010-11-23 Thread Peter Tanski
Thanks for the advice. I ran a row-by-row test, including debug output. I'll put a test case together as well, but I believe I have narrowed down the problem somewhat. The first split occurs when the 6th row is inserted and there are 6 calls to Compress(); however, picksplit only receives 4

Re: [HACKERS] GiST seems to drop left-branch leaf tuples

2010-11-23 Thread Alvaro Herrera
Excerpts from Peter Tanski's message of Tue Nov 23 12:00:52 -0300 2010: There are checks inside the Picksplit() function for the number of entries: OffsetNumber maxoff = entryvec->n - 1; int n_entries, j; n_entries = Max(maxoff, 1) - 1; j = 0; for (i = FirstOffsetNumber; i

Re: [HACKERS] GiST seems to drop left-branch leaf tuples

2010-11-23 Thread Peter Tanski
I should correct what I just wrote: the first and last entries in entryvec->vector are invalid. On Nov 23, 2010, at 11:39 AM, Peter Tanski wrote: Picksplit() seems to be an exceptional case here: the first and last numbers in entryvec are invalid so entryvec->vector[entryvec->n - 1] is

Re: [HACKERS] GiST seems to drop left-branch leaf tuples

2010-11-23 Thread Peter Tanski
Picksplit() seems to be an exceptional case here: the first and last numbers in entryvec are invalid, so entryvec->vector[entryvec->n - 1] is invalid. All the other Picksplit() functions in the GiST code use the same convention. For example, see the btree_gist picksplit function, at

Re: [HACKERS] GiST seems to drop left-branch leaf tuples

2010-11-23 Thread Peter Tanski
On Nov 23, 2010, at 1:37 PM, Yeb Havinga wrote: j = 0; for (i = FirstOffsetNumber; i < maxoff; i = OffsetNumberNext(i)) { FPrint* v = deserialize_fprint(entv[i].key); Isn't this off by one? Offset numbers are 1-based, so the maxoff computation is wrong. The first for loop of all others
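
The off-by-one question above comes down to the loop bound. The following is a minimal sketch, not the code from the thread: DemoEntry, DemoEntryVector and count_visited() are hypothetical stand-ins so that it compiles without the PostgreSQL headers, and only the OffsetNumber macros mirror the real definitions. It assumes the usual convention that slot 0 of the vector is unused and the entries occupy slots 1 through n - 1.

```c
/* Stand-ins for the PostgreSQL offset definitions. */
typedef unsigned short OffsetNumber;
#define FirstOffsetNumber   ((OffsetNumber) 1)
#define OffsetNumberNext(o) ((OffsetNumber) (1 + (o)))

/* Hypothetical replacements for GISTENTRY and GistEntryVector. */
typedef struct { int key; } DemoEntry;
typedef struct { int n; DemoEntry vector[16]; } DemoEntryVector;

/*
 * Count the slots a picksplit-style loop visits.
 * inclusive != 0 uses "i <= maxoff"; inclusive == 0 uses "i < maxoff".
 */
static int
count_visited(const DemoEntryVector *entryvec, int inclusive)
{
    OffsetNumber maxoff = (OffsetNumber) (entryvec->n - 1);
    OffsetNumber i;
    int          visited = 0;

    for (i = FirstOffsetNumber;
         inclusive ? (i <= maxoff) : (i < maxoff);
         i = OffsetNumberNext(i))
        visited++;

    return visited;
}
```

Under that assumption, a vector with n = 7 holds six entries; the inclusive bound visits all six, while the exclusive bound stops one short and silently skips the last entry.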

Re: [HACKERS] GiST seems to drop left-branch leaf tuples

2010-11-23 Thread Peter Tanski
I found another off-by-one error in my Picksplit() algorithm and the GiST index contains one leaf tuple for each row in the table now. The error was to start from 1 instead of 0 when assigning the entries. Thanks to everyone for your help. For the record, this is the only GiST index I know
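
One plausible reading of "start from 1 instead of 0" is that the input offsets are 1-based while the output arrays are ordinary 0-based C arrays. The sketch below illustrates that mixed indexing under this assumption; DemoSplit and demo_split() are hypothetical stand-ins (the field names only echo the spl_left/spl_nleft style used by GiST split vectors), and the halving rule is a placeholder, not a real split heuristic.

```c
/* Stand-ins for the PostgreSQL offset definitions. */
typedef unsigned short OffsetNumber;
#define FirstOffsetNumber   ((OffsetNumber) 1)
#define OffsetNumberNext(o) ((OffsetNumber) (1 + (o)))

/* Hypothetical split result; each side holds at most 16 offsets. */
typedef struct
{
    OffsetNumber spl_left[16];
    int          spl_nleft;
    OffsetNumber spl_right[16];
    int          spl_nright;
} DemoSplit;

/*
 * Distribute offsets 1..maxoff over two sides (caller keeps maxoff <= 32).
 * The input offsets start at 1, but each output array is filled from
 * index 0 by post-incrementing its counter.
 */
static void
demo_split(OffsetNumber maxoff, DemoSplit *out)
{
    OffsetNumber i;

    out->spl_nleft = 0;
    out->spl_nright = 0;

    for (i = FirstOffsetNumber; i <= maxoff; i = OffsetNumberNext(i))
    {
        if (i <= maxoff / 2)
            out->spl_left[out->spl_nleft++] = i;    /* first write lands at [0] */
        else
            out->spl_right[out->spl_nright++] = i;
    }
}
```

A quick sanity check for any split routine is that the two counters add up to the number of input entries; if the sum is short, some tuple was never assigned to either side.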

Re: [HACKERS] GiST seems to drop left-branch leaf tuples

2010-11-23 Thread Oleg Bartunov
Peter, glad to know you succeeded. FYI, a year ago we developed a GiST extension for rdkit.org. Oleg On Tue, 23 Nov 2010, Peter Tanski wrote: I found another off-by-one error in my Picksplit() algorithm and the GiST index contains one leaf tuple for each row in the table now. The error was

[HACKERS] GiST seems to drop left-branch leaf tuples

2010-11-22 Thread Peter Tanski
I have been working on a plugin for GiST that has some unusual features:
* The data type for both Node and Leaf keys is large (typically 4222 bytes on 32-bit; 4230 bytes on 64-bit).
* Due to the large size the storage class is EXTENDED (MAIN would only degrade to EXTENDED in any case).

Re: [HACKERS] GiST seems to drop left-branch leaf tuples

2010-11-22 Thread Peter Tanski
One minor correction: postgres=# explain select count(*) from fps2 f1 join fps2 f2 on f1.fingerprint = f2.fingerprint; QUERY PLAN Aggregate

Re: [HACKERS] GiST seems to drop left-branch leaf tuples

2010-11-22 Thread Heikki Linnakangas
On 22.11.2010 23:18, Peter Tanski wrote: Whatever test I use for Same(), Penalty() and Consistent() does not seem to affect the problem significantly. For now I am only using Consistent() as a check for retrieval. I believe it's not possible to lose leaf tuples with incorrectly defined gist