[HACKERS] pg_filedump strangeness

2010-04-06 Thread Alvaro Herrera
Hi,

I'm chasing an apparent index corruption problem, and I came across
something I can't quite explain in pg_filedump.  Say I dump a non-leaf
btree index page:

***
* PostgreSQL File/Block Formatted Dump Utility - Version 8.3.0
*
* File: 31141
* Options used: -fi -x -R 6246 
*
* Dump created on: Tue Apr  6 17:40:28 2010
***

Block 6246 
Header -
 Block Offset: 0x030cc000 Offsets: Lower  36 (0x0024)
 Block: Size 8192  Version4Upper8120 (0x1fb8)
 LSN:  logid   1077 recoff 0x45c8b660  Special  8176 (0x1ff0)
 Items:3  Free Space: 8084
 TLI: 0x0001  Prune XID: 0x  Flags: 0x ()
 Length (including item array): 36

  : 3504 60b6c845 0100 2400b81f  5...`..E$...
  0010: f01f0420  d89f3000 d09f1000  ... ..0.
  0020: b89f3000 ..0.

Data -- 
 Item   1 -- Length:   24  Offset: 8152 (0x1fd8)  Flags: NORMAL
  Block Id: 6232  linp Index: 1  Size: 24
  Has Nulls: 32768  Has Varwidths: 0

  1fd8: 5818 01001880 0100   ..X.
  1fe8: 80bcc57d 74230100...}t#..

 Item   2 -- Length:8  Offset: 8144 (0x1fd0)  Flags: NORMAL
  Block Id: 2756  linp Index: 1  Size: 8
  Has Nulls: 0  Has Varwidths: 0

  1fd0: c40a 01000800

 Item   3 -- Length:   24  Offset: 8120 (0x1fb8)  Flags: NORMAL
  Block Id: 6231  linp Index: 1  Size: 24
  Has Nulls: 32768  Has Varwidths: 0

  1fb8: 5718 01001880 0100   ..W.
  1fc8: 4009cc7f 73230100@...s#..


Special Section -
 BTree Index Section:
  Flags: 0x ()
  Blocks: Previous (6109)  Next (6305)  Level (1)  CycleId (0)

  1ff0: dd17 a118 0100   


*** End of Requested Range Encountered. Last Block Read: 6246 ***


Notice how item 2 has size 8, but regular entries have size 24.  I know
this is related to the high key of this page, but I can't quite figure
out why the short entry is 2 not 1.  Is item 2 just assumed to be
greater than the previous' page high key?

Page's 6109 high key is:

 Item   1 -- Length:   24  Offset: 8152 (0x1fd8)  Flags: NORMAL
  Block Id: 6101  linp Index: 1  Size: 24
  Has Nulls: 32768  Has Varwidths: 0

  1fd8: d517 01001880 0100   
  1fe8: 8004f17d 6f230100...}o#..


Note that the data values are integer timestamp without time zone in
little endian byte order.


(The Has Nulls bit is somewhat bogus -- it displays 32768 when the
0x8000 bit is on, which is rather surprising.  I'd expect it to display
1).


-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_filedump strangeness

2010-04-06 Thread Tom Lane
Alvaro Herrera alvhe...@commandprompt.com writes:
 I'm chasing an apparent index corruption problem, and I came across
 something I can't quite explain in pg_filedump.  Say I dump a non-leaf
 btree index page:

I think this is actually OK.  Remember that in a non-rightmost page,
item 1 is the high key not a data entry.  On the other hand, in a
non-leaf page, we don't bother to store the key for the first downlink
entry, since the associated key is really minus infinity.  Cf
nbtree/README:

On a non-leaf page, the data items are down-links to child pages with
bounding keys.  The key in each data item is the *lower* bound for
keys on that child page, so logically the key is to the left of that
downlink.  The high key (if present) is the upper bound for the last
downlink.  The first data item on each such page has no lower bound
--- or lower bound of minus infinity, if you prefer.  The comparison
routines must treat it accordingly.  The actual key stored in the
item is irrelevant, and need not be stored at all.  This arrangement
corresponds to the fact that an LY non-leaf page has one more pointer
than key.

So item 2 doesn't have a key in it.  The other two items have null
keys, which means they need a null bitmap.  I don't however understand
why there seems to be data as well as a null bitmap in there --- is
this perhaps a two-column index?

 (The Has Nulls bit is somewhat bogus -- it displays 32768 when the
 0x8000 bit is on, which is rather surprising.  I'd expect it to display
 1).

Yeah, I noticed that too.  Made a note to myself to fix it in the next
revision of pg_filedump, which I suppose I'd better get on with
producing...

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_filedump strangeness

2010-04-06 Thread Alvaro Herrera
Tom Lane wrote:
 Alvaro Herrera alvhe...@commandprompt.com writes:
  I'm chasing an apparent index corruption problem, and I came across
  something I can't quite explain in pg_filedump.  Say I dump a non-leaf
  btree index page:
 
 I think this is actually OK.  Remember that in a non-rightmost page,
 item 1 is the high key not a data entry.  On the other hand, in a
 non-leaf page, we don't bother to store the key for the first downlink
 entry, since the associated key is really minus infinity.  Cf
 nbtree/README:
 
   On a non-leaf page, the data items are down-links to child pages with
   bounding keys.  The key in each data item is the *lower* bound for
   keys on that child page, so logically the key is to the left of that
   downlink.  The high key (if present) is the upper bound for the last
   downlink.  The first data item on each such page has no lower bound
   --- or lower bound of minus infinity, if you prefer.  The comparison
   routines must treat it accordingly.  The actual key stored in the
   item is irrelevant, and need not be stored at all.  This arrangement
   corresponds to the fact that an LY non-leaf page has one more pointer
   than key.

Ahh, I had forgotten that bit completely.  Thanks.


 So item 2 doesn't have a key in it.  The other two items have null
 keys, which means they need a null bitmap.

Correct.

 I don't however understand
 why there seems to be data as well as a null bitmap in there --- is
 this perhaps a two-column index?

Eh, yeah, it's a two column index, so it's OK.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_filedump strangeness

2010-04-06 Thread Alvaro Herrera
Tom Lane wrote:
 Alvaro Herrera alvhe...@commandprompt.com writes:
  I'm chasing an apparent index corruption problem, and I came across
  something I can't quite explain in pg_filedump.  Say I dump a non-leaf
  btree index page:
 
 I think this is actually OK.  Remember that in a non-rightmost page,
 item 1 is the high key not a data entry.

Oh, BTW, this is not what's corrupted about this index -- I just had
trouble following what pg_filedump was reporting.  The corruption is
more subtle: vacuum cannot find the parent page when trying to mark a
page for deletion.

-- 
Alvaro Herrerahttp://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers