Re: [HACKERS] Fwd: index corruption in PG 8.3.13

2011-03-16 Thread Nikhil Sontakke
Hi, To summarize, as I see it - the zeroed out block 523 should have been the second left-most leaf and should have pointed out to 522. Thus re-establishing the index chain 524 - 523 - 522 - 277 - ... Was there a machine restart in the picture as well? It seems there might have been a

Re: [HACKERS] Fwd: index corruption in PG 8.3.13

2011-03-16 Thread Robert Haas
On Wed, Mar 16, 2011 at 7:51 AM, Nikhil Sontakke nikhil.sonta...@enterprisedb.com wrote: Hi, To summarize, as I see it - the zeroed out block 523 should have been the second left-most leaf and should have pointed out to 522. Thus re-establishing the index chain 524 - 523 - 522 - 277 - ...

Re: [HACKERS] Fwd: index corruption in PG 8.3.13

2011-03-16 Thread Nikhil Sontakke
Hi, Of course, as you mentioned earlier, it's not impossible there's a bug in the recovery code. Yeah, I was looking at the repair_frag function in 8.3.13 (yup it's ugly!) and found out that the normal ExecInsertIndexTuples call is used to insert the index entries. That is standard index code

Re: [HACKERS] Fwd: index corruption in PG 8.3.13

2011-03-16 Thread Alvaro Herrera
Excerpts from Nikhil Sontakke's message of mié mar 16 08:51:00 -0300 2011: Hi, To summarize, as I see it - the zeroed out block 523 should have been the second left-most leaf and should have pointed out to 522. Thus re-establishing the index chain 524 - 523 - 522 - 277 - ... Was

Re: [HACKERS] Fwd: index corruption in PG 8.3.13

2011-03-14 Thread Nikhil Sontakke
Hi Daniel, I have also, coincidentally, encountered corruption of a system catalog index -- 8.3.11 -- I have saved the file for forensics.  Is it possible that I also receive a copy of this program? Will it be possible for you to share the file/logs off-list with me? I can also try to do

Re: [HACKERS] Fwd: index corruption in PG 8.3.13

2011-03-12 Thread Daniel Farina
On Wed, Mar 9, 2011 at 6:02 AM, Alvaro Herrera alvhe...@commandprompt.com wrote: I'll send you a perl program we wrote for a customer to check for strange issues in btrees.  Please give it a spin; it may give you more clues.  If you find additional checks to add, please let me know! I have

Re: [HACKERS] Fwd: index corruption in PG 8.3.13

2011-03-12 Thread Greg Stark
On Sat, Mar 12, 2011 at 3:06 AM, Nikhil Sontakke nikhil.sonta...@enterprisedb.com wrote: Live 522's      (LSN: logid 29, recoff 0xd1fade3c) previous points to the zeroed out 523 block. Note that this seems to be latest LSN in the data file. So do you have logs from the server when it was

Re: [HACKERS] Fwd: index corruption in PG 8.3.13

2011-03-11 Thread Nikhil Sontakke
Hi, 1. Somebody inserts a bunch of new tuples into the relation, causing growth in the index. In case it's not obvious VACUUM FULL would do precisely that. Oh, I didn't even think about that.  Yeah, that could be it, too. Thanks a lot Greg and Robert. This theory seems very plausible. VF

Re: [HACKERS] Fwd: index corruption in PG 8.3.13

2011-03-11 Thread Robert Haas
On Fri, Mar 11, 2011 at 6:17 AM, Nikhil Sontakke nikhil.sonta...@enterprisedb.com wrote: VACUUM FULL - immediate shutdown - problem with recovery? An immediate shutdown == an intentional crash. OK, so you have the VACUUM FULL and the immediate shutdown just afterward. So we just need to figure

Re: [HACKERS] Fwd: index corruption in PG 8.3.13

2011-03-11 Thread Nikhil Sontakke
VACUUM FULL - immediate shutdown - problem with recovery? An immediate shutdown == an intentional crash. OK, so you have the VACUUM FULL and the immediate shutdown just afterward. So we just need to figure out what happened during recovery. Right. But WAL replay should still have

Re: [HACKERS] Fwd: index corruption in PG 8.3.13

2011-03-11 Thread Greg Stark
On Fri, Mar 11, 2011 at 2:28 PM, Nikhil Sontakke nikhil.sonta...@enterprisedb.com wrote: I'm not sure, but I doubt it.  If the VACUUM FULL committed, then the WAL records should be on disk, but if the immediate shutdown happened while it was still running, then the WAL records might still be in

Re: [HACKERS] Fwd: index corruption in PG 8.3.13

2011-03-11 Thread Nikhil Sontakke
Oh yeah, so if VF committed, the xlog should have been ok too, but can't say the same about the shared buffers. But there was a later block that *was* written out. What was the LSN on that block? everything in the WAL log should have been fsynced up to that point when that buffer was

Re: [HACKERS] Fwd: index corruption in PG 8.3.13

2011-03-10 Thread Nikhil Sontakke
Hi, Other peculiarity in the index file is that we found a lot of zeroed out pages. Blocks from #279 to #518 are all completely zeroed out without any signs of even a page header. Any ideas on how we can get so many zeroed out blocks? Apart from the extend code path, I fail to see any other.

Re: [HACKERS] Fwd: index corruption in PG 8.3.13

2011-03-10 Thread Robert Haas
On Wed, Mar 9, 2011 at 7:14 PM, Greg Stark gsst...@mit.edu wrote: On Wed, Mar 9, 2011 at 11:28 PM, Nikhil Sontakke nikhil.sonta...@enterprisedb.com wrote: Other peculiarity in the index file is that we found a lot of zeroed out pages. Blocks from #279 to #518 are all completely zeroed out

Re: [HACKERS] Fwd: index corruption in PG 8.3.13

2011-03-10 Thread Greg Stark
On Thu, Mar 10, 2011 at 1:45 PM, Robert Haas robertmh...@gmail.com wrote: 1. Somebody inserts a bunch of new tuples into the relation, causing growth in the index. In case it's not obvious VACUUM FULL would do precisely that. 2. Before the blocks are written to disk, system crash. 3.

Re: [HACKERS] Fwd: index corruption in PG 8.3.13

2011-03-10 Thread Robert Haas
On Thu, Mar 10, 2011 at 12:52 PM, Greg Stark gsst...@mit.edu wrote: On Thu, Mar 10, 2011 at 1:45 PM, Robert Haas robertmh...@gmail.com wrote: 1. Somebody inserts a bunch of new tuples into the relation, causing growth in the index. In case it's not obvious VACUUM FULL would do precisely that.

[HACKERS] Fwd: index corruption in PG 8.3.13

2011-03-09 Thread Nikhil Sontakke
Re-sending without the attachments. Can someone please allow my attachments through from the previous email? TIA Nikhils -- Forwarded message -- From: Nikhil Sontakke nikhil.sonta...@enterprisedb.com Date: Wed, Mar 9, 2011 at 8:42 PM Subject: index corruption in PG 8.3.13 To:

Re: [HACKERS] Fwd: index corruption in PG 8.3.13

2011-03-09 Thread Alvaro Herrera
Excerpts from Nikhil Sontakke's message of mié mar 09 10:51:50 -0300 2011: Re-sending without the attachments. Can someone please allow my attachments through from the previous email? They are not in the moderation queue, so presumably they got eaten by the antispam grue. Blocks 519 to 521

Re: [HACKERS] Fwd: index corruption in PG 8.3.13

2011-03-09 Thread Alvaro Herrera
Excerpts from Nikhil Sontakke's message of mié mar 09 11:16:22 -0300 2011: Re-sending without the attachments. Can someone please allow my attachments through from the previous email? They are not in the moderation queue, so presumably they got eaten by the antispam grue. Ouch.

Re: [HACKERS] Fwd: index corruption in PG 8.3.13

2011-03-09 Thread Nikhil Sontakke
Ouch. Attempting to attach the dotty image again.. I don't understand this graph.  What are the arrows?  Downlinks or sibling pointers? Sorry, they are sibling previous and next pointers. The ROOT is at level 1, rest all live blocks are at level 0. #524 is the leftmost page. Regards,

Re: [HACKERS] Fwd: index corruption in PG 8.3.13

2011-03-09 Thread Nikhil Sontakke
Hi, Blocks 519 to 521 are DELETED. They do not have the LEAF flag set, meaning they could be internal pages, but that is strange since ROOT page is at level 1. Also importantly their next XID is set FrozenXid, meaning VACUUM FULL was at play. Maybe due to deletes, we reduced the hierarchy

Re: [HACKERS] Fwd: index corruption in PG 8.3.13

2011-03-09 Thread Greg Stark
On Wed, Mar 9, 2011 at 11:28 PM, Nikhil Sontakke nikhil.sonta...@enterprisedb.com wrote: Other peculiarity in the index file is that we found a lot of zeroed out pages. Blocks from #279 to #518 are all completely zeroed out without any signs of even a page header. Any ideas on how we can get

Re: [HACKERS] Fwd: index corruption in PG 8.3.13

2011-03-09 Thread Nikhil Sontakke
What does stat say for the index data file? Are the Size and Blocks values the same (modulo block size)? Or are these blocks actually not allocated? stat 58401 File: `58401' Size: 4300800 Blocks: 8400 IO Block: 4096 regular file Device: 801h/2049d Inode: 13901264

Re: [HACKERS] Fwd: index corruption in PG 8.3.13

2011-03-09 Thread Alvaro Herrera
Excerpts from Nikhil Sontakke's message of mié mar 09 20:28:19 -0300 2011: While I rummage around the code more, does anyone have any theories on the below? Other peculiarity in the index file is that we found a lot of zeroed out pages. Blocks from #279 to #518 are all completely zeroed out