[HACKERS] pg_filedump strangeness
Hi, I'm chasing an apparent index corruption problem, and I came across something I can't quite explain in pg_filedump. Say I dump a non-leaf btree index page: *** * PostgreSQL File/Block Formatted Dump Utility - Version 8.3.0 * * File: 31141 * Options used: -fi -x -R 6246 * * Dump created on: Tue Apr 6 17:40:28 2010 *** Block 6246 Header - Block Offset: 0x030cc000 Offsets: Lower 36 (0x0024) Block: Size 8192 Version4Upper8120 (0x1fb8) LSN: logid 1077 recoff 0x45c8b660 Special 8176 (0x1ff0) Items:3 Free Space: 8084 TLI: 0x0001 Prune XID: 0x Flags: 0x () Length (including item array): 36 : 3504 60b6c845 0100 2400b81f 5...`..E$... 0010: f01f0420 d89f3000 d09f1000 ... ..0. 0020: b89f3000 ..0. Data -- Item 1 -- Length: 24 Offset: 8152 (0x1fd8) Flags: NORMAL Block Id: 6232 linp Index: 1 Size: 24 Has Nulls: 32768 Has Varwidths: 0 1fd8: 5818 01001880 0100 ..X. 1fe8: 80bcc57d 74230100...}t#.. Item 2 -- Length:8 Offset: 8144 (0x1fd0) Flags: NORMAL Block Id: 2756 linp Index: 1 Size: 8 Has Nulls: 0 Has Varwidths: 0 1fd0: c40a 01000800 Item 3 -- Length: 24 Offset: 8120 (0x1fb8) Flags: NORMAL Block Id: 6231 linp Index: 1 Size: 24 Has Nulls: 32768 Has Varwidths: 0 1fb8: 5718 01001880 0100 ..W. 1fc8: 4009cc7f 73230100@...s#.. Special Section - BTree Index Section: Flags: 0x () Blocks: Previous (6109) Next (6305) Level (1) CycleId (0) 1ff0: dd17 a118 0100 *** End of Requested Range Encountered. Last Block Read: 6246 *** Notice how item 2 has size 8, but regular entries have size 24. I know this is related to the high key of this page, but I can't quite figure out why the short entry is 2 not 1. Is item 2 just assumed to be greater than the previous' page high key? Page's 6109 high key is: Item 1 -- Length: 24 Offset: 8152 (0x1fd8) Flags: NORMAL Block Id: 6101 linp Index: 1 Size: 24 Has Nulls: 32768 Has Varwidths: 0 1fd8: d517 01001880 0100 1fe8: 8004f17d 6f230100...}o#.. Note that the data values are integer timestamp without time zone in little endian byte order. (The Has Nulls bit is somewhat bogus -- it displays 32768 when the 0x8000 bit is on, which is rather surprising. I'd expect it to display 1). -- Alvaro Herrerahttp://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_filedump strangeness
Alvaro Herrera alvhe...@commandprompt.com writes: I'm chasing an apparent index corruption problem, and I came across something I can't quite explain in pg_filedump. Say I dump a non-leaf btree index page: I think this is actually OK. Remember that in a non-rightmost page, item 1 is the high key not a data entry. On the other hand, in a non-leaf page, we don't bother to store the key for the first downlink entry, since the associated key is really minus infinity. Cf nbtree/README: On a non-leaf page, the data items are down-links to child pages with bounding keys. The key in each data item is the *lower* bound for keys on that child page, so logically the key is to the left of that downlink. The high key (if present) is the upper bound for the last downlink. The first data item on each such page has no lower bound --- or lower bound of minus infinity, if you prefer. The comparison routines must treat it accordingly. The actual key stored in the item is irrelevant, and need not be stored at all. This arrangement corresponds to the fact that an LY non-leaf page has one more pointer than key. So item 2 doesn't have a key in it. The other two items have null keys, which means they need a null bitmap. I don't however understand why there seems to be data as well as a null bitmap in there --- is this perhaps a two-column index? (The Has Nulls bit is somewhat bogus -- it displays 32768 when the 0x8000 bit is on, which is rather surprising. I'd expect it to display 1). Yeah, I noticed that too. Made a note to myself to fix it in the next revision of pg_filedump, which I suppose I'd better get on with producing... regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_filedump strangeness
Tom Lane wrote: Alvaro Herrera alvhe...@commandprompt.com writes: I'm chasing an apparent index corruption problem, and I came across something I can't quite explain in pg_filedump. Say I dump a non-leaf btree index page: I think this is actually OK. Remember that in a non-rightmost page, item 1 is the high key not a data entry. On the other hand, in a non-leaf page, we don't bother to store the key for the first downlink entry, since the associated key is really minus infinity. Cf nbtree/README: On a non-leaf page, the data items are down-links to child pages with bounding keys. The key in each data item is the *lower* bound for keys on that child page, so logically the key is to the left of that downlink. The high key (if present) is the upper bound for the last downlink. The first data item on each such page has no lower bound --- or lower bound of minus infinity, if you prefer. The comparison routines must treat it accordingly. The actual key stored in the item is irrelevant, and need not be stored at all. This arrangement corresponds to the fact that an LY non-leaf page has one more pointer than key. Ahh, I had forgotten that bit completely. Thanks. So item 2 doesn't have a key in it. The other two items have null keys, which means they need a null bitmap. Correct. I don't however understand why there seems to be data as well as a null bitmap in there --- is this perhaps a two-column index? Eh, yeah, it's a two column index, so it's OK. -- Alvaro Herrerahttp://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] pg_filedump strangeness
Tom Lane wrote: Alvaro Herrera alvhe...@commandprompt.com writes: I'm chasing an apparent index corruption problem, and I came across something I can't quite explain in pg_filedump. Say I dump a non-leaf btree index page: I think this is actually OK. Remember that in a non-rightmost page, item 1 is the high key not a data entry. Oh, BTW, this is not what's corrupted about this index -- I just had trouble following what pg_filedump was reporting. The corruption is more subtle: vacuum cannot find the parent page when trying to mark a page for deletion. -- Alvaro Herrerahttp://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers