Thanks for the advice. I ran a row-by-row test with debug output. I'll put a
test case together as well, but I believe I have narrowed down the problem
somewhat. The first split occurs when the 6th row is inserted. There are 6
calls to Decompress(); however, Picksplit() receives only 4 of those 6 tuples
and the other two are dropped.
postgres=# \i xaa
psql:xaa:1: NOTICE: [pgfprint.c:fprint_compress:379] entered compress
INSERT 0 1
postgres=# select gist_stat('fps2_fingerprint_ix');
gist_stat
---------------------------------------
Number of levels: 1 +
Number of pages: 1 +
Number of leaf pages: 1 +
Number of tuples: 1 +
Number of invalid tuples: 0 +
Number of leaf tuples: 1 +
Total size of tuples: 1416 bytes+
Total size of leaf tuples: 1416 bytes+
Total size of index: 8192 bytes+
postgres=# \i xab
psql:xab:1: NOTICE: [pgfprint.c:fprint_compress:379] entered compress
INSERT 0 1
postgres=# select gist_stat('fps2_fingerprint_ix');
gist_stat
---------------------------------------
Number of levels: 1 +
Number of pages: 1 +
Number of leaf pages: 1 +
Number of tuples: 2 +
Number of invalid tuples: 0 +
Number of leaf tuples: 2 +
Total size of tuples: 2820 bytes+
Total size of leaf tuples: 2820 bytes+
Total size of index: 8192 bytes+
postgres=# \i xac
psql:xac:1: NOTICE: [pgfprint.c:fprint_compress:379] entered compress
INSERT 0 1
postgres=# select gist_stat('fps2_fingerprint_ix');
gist_stat
---------------------------------------
Number of levels: 1 +
Number of pages: 1 +
Number of leaf pages: 1 +
Number of tuples: 3 +
Number of invalid tuples: 0 +
Number of leaf tuples: 3 +
Total size of tuples: 4224 bytes+
Total size of leaf tuples: 4224 bytes+
Total size of index: 8192 bytes+
postgres=# \i xad
psql:xad:1: NOTICE: [pgfprint.c:fprint_compress:379] entered compress
INSERT 0 1
postgres=# select gist_stat('fps2_fingerprint_ix');
gist_stat
---------------------------------------
Number of levels: 1 +
Number of pages: 1 +
Number of leaf pages: 1 +
Number of tuples: 4 +
Number of invalid tuples: 0 +
Number of leaf tuples: 4 +
Total size of tuples: 5628 bytes+
Total size of leaf tuples: 5628 bytes+
Total size of index: 8192 bytes+
postgres=# \i xae
psql:xae:1: NOTICE: [pgfprint.c:fprint_compress:379] entered compress
INSERT 0 1
postgres=# select gist_stat('fps2_fingerprint_ix');
gist_stat
---------------------------------------
Number of levels: 1 +
Number of pages: 1 +
Number of leaf pages: 1 +
Number of tuples: 5 +
Number of invalid tuples: 0 +
Number of leaf tuples: 5 +
Total size of tuples: 7032 bytes+
Total size of leaf tuples: 7032 bytes+
Total size of index: 8192 bytes+
postgres=# \i xaf
psql:xaf:1: NOTICE: [pgfprint.c:fprint_compress:379] entered compress
psql:xaf:1: NOTICE: [pgfprint.c:fprint_decompress:421] entered decompress
psql:xaf:1: NOTICE: [pgfprint.c:fprint_decompress:421] entered decompress
psql:xaf:1: NOTICE: [pgfprint.c:fprint_decompress:421] entered decompress
psql:xaf:1: NOTICE: [pgfprint.c:fprint_decompress:421] entered decompress
psql:xaf:1: NOTICE: [pgfprint.c:fprint_decompress:421] entered decompress
psql:xaf:1: NOTICE: [pgfprint.c:fprint_decompress:421] entered decompress
psql:xaf:1: NOTICE: [pgfprint.c:fprint_picksplit:660] entered picksplit
psql:xaf:1: NOTICE: [pgfprint.c:fprint_picksplit:812] split: 2 left, 2 right
psql:xaf:1: NOTICE: [pgfprint.c:fprint_compress:379] entered compress
psql:xaf:1: NOTICE: [pgfprint.c:fprint_compress:379] entered compress
INSERT 0 1
postgres=# select gist_stat('fps2_fingerprint_ix');
gist_stat
----------------------------------------
Number of levels: 2 +
Number of pages: 3 +
Number of leaf pages: 2 +
Number of tuples: 6 +
Number of invalid tuples: 0 +
Number of leaf tuples: 4 +
Total size of tuples: 8460 bytes +
Total size of leaf tuples: 5640 bytes +
Total size of index: 24576 bytes+
postgres=#
There are checks inside the Picksplit() function for the number of entries:

    OffsetNumber maxoff = entryvec->n - 1;
    int n_entries, j;

    n_entries = Max(maxoff, 1) - 1;
    j = 0;
    for (i = FirstOffsetNumber; i < maxoff; i = OffsetNumberNext(i)) {
        FPrint* v = deserialize_fprint(entv[i].key);
        if (!GIST_LEAF(&entv[i])) {
            leaf_split = false;
        }
        if (v == NULL) {
            elog(ERROR, "entry %d is invalid", i);
        }
        raw_vec[j] = v;
        vec_ixs[j++] = i;
    }
    if (n_entries > j) {
        elog(WARNING, "[%s:%s:%d]: " SIZE_T_FMT " bad entries",
             __FILE__, __func__, __LINE__, n_entries - j);
        n_entries = j;
    } else if (n_entries < j) {
        elog(ERROR, "skipping %d entries", j-n_entries);
    }

So I know that Picksplit() sees only 4 entries, despite the 6 calls to
Decompress().
Note that Decompress() returns its input unchanged; entries are detoasted in
the deserialize_fprint() function, which mallocs each value:

    Datum fprint_decompress(PG_FUNCTION_ARGS) {
        GISTENTRY* entry = (GISTENTRY*)PG_GETARG_POINTER(0);

        FPDEBUG("entered decompress");
        if (!entry) {
            elog(ERROR, "fprint_decompress: entry is NULL");
        }
        // cut out here -- we handle the memory
        PG_RETURN_POINTER(entry);
    }

I'll put together a test case and send that on.
On Nov 23, 2010, at 2:29 AM, Heikki Linnakangas wrote:
> On 22.11.2010 23:18, Peter Tanski wrote:
>> Whatever test I use for Same(), Penalty() and Consistent() does not seem
>> to affect the problem significantly. For now I am only using
>> Consistent() as a check for retrieval.
>
> I believe it's not possible to lose leaf tuples with incorrectly defined gist
> support functions. You might get completely bogus results, but the tuples
> should be there when you look at gist_tree() output. So this sounds like a
> gist bug to me.
>
>> Note that there are only 133 leaf tuples -- for 500 rows. If the index
>> process were operating correctly, there should have been 500 leaf tuples
>> there. If I REINDEX the table the number of leaf tuples may change
>> slightly but not by much.
>
> One idea for debugging is to insert the rows to the table one by one, and run
> the query after each insertion. When do the leaf tuples disappear?
>
> If you can put together a small self-contained test case and post it to the
> list, I can take a look.
>
> --
> Heikki Linnakangas
> EnterpriseDB http://www.enterprisedb.com