On Thu, 27 Oct 2005, Tom Lane wrote:

> "Jim C. Nasby" <[EMAIL PROTECTED]> writes:
> > On Wed, Oct 26, 2005 at 09:29:23PM -0400, Tom Lane wrote:
> >> Could you send me the whole file (off-list)?
>
> > Ok, will send URL as soon as I have it from client.
>
> Well, the answer is that there's nothing wrong with that index except
> that four consecutive pages near the end (32K total) have been zeroed
> out :-(

[snip]

> Bottom line is that index searches probably ought to have some
> non-Assert defenses against zeroed-out pages.  Obviously we can't
> expect to catch every flavor of data corruption, but this particular
> one has been seen before...

Definitely. I've seen faulty hardware somehow zero blocks where I would
have expected random data. I wonder if we can test with PageIsNew(), which
is very inexpensive. The question is: what do we do when we detect this?
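
To make the idea concrete, here is a rough standalone sketch, not real
backend code. The struct, the constant and both function names are made up
for illustration; the only thing borrowed from the backend is the idea
behind PageIsNew(), namely that a page whose "upper" offset is still zero
has never been initialised. The 8K page size is assumed from the "four
pages, 32K total" figure above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical page size: 8K, matching "four pages, 32K total". */
#define SKETCH_BLCKSZ 8192

/* Hypothetical, heavily simplified page header (not the real layout). */
typedef struct SketchPageHeader
{
    uint16_t lower;    /* offset to start of free space */
    uint16_t upper;    /* offset to end of free space */
    uint16_t special;  /* offset to start of special space */
} SketchPageHeader;

/* Cheap test in the spirit of PageIsNew(): only looks at one field. */
static bool
sketch_page_is_new(const unsigned char *page)
{
    SketchPageHeader hdr;

    assert(page != NULL);
    memcpy(&hdr, page, sizeof(hdr));    /* avoids alignment assumptions */
    return hdr.upper == 0;
}

/* Expensive test: scans the whole page for any non-zero byte. */
static bool
sketch_page_is_all_zero(const unsigned char *page)
{
    size_t i;

    assert(page != NULL);
    for (i = 0; i < SKETCH_BLCKSZ; i++)
    {
        if (page[i] != 0)
            return false;
    }
    return true;
}
```

The cheap test is the one an index search could afford on every page it
reads; the full scan would only be worth running once the cheap test has
already fired, to tell a wholly zeroed page from a damaged header.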

>
> BTW, Jim, any thoughts about how the index got corrupted?  Have you
> had any crashes on that machine lately?

I have spoken with Jim on IRC; he says that there have been several crashes
recently due to a faulty disk array. I guess the zeroing could be an
outcome of the faulty disk. I wonder if one of those crashes could have
hit somewhere around mdextend(), after we create a zero'd page but before
we could write out the initialised page.

If this happened 4 times in a row it could account for the problem. It
does seem a bit unlikely, though.

That being said, is there any reason we don't extend the file with a
PageInit()'d block instead of a zero'd one?
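
Roughly what I have in mind, again as a standalone sketch with made-up
names rather than the real PageInit(): fill in the header of the new block
before it is handed to the write, so that a block that reaches disk is
never all zeroes. The header layout and the 8K page size here are
assumptions for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical page size, assumed to be 8K. */
#define DEMO_BLCKSZ 8192

/* Hypothetical, heavily simplified page header (not the real layout). */
typedef struct DemoPageHeader
{
    uint16_t lower;    /* offset to start of free space */
    uint16_t upper;    /* offset to end of free space */
    uint16_t special;  /* offset to start of special space */
} DemoPageHeader;

/*
 * Prepare an empty but initialised page in 'page'.
 * The body stays zeroed; only the header offsets are filled in, so the
 * "upper" field is non-zero and the page no longer looks new.
 */
static void
demo_page_init(unsigned char *page, size_t special_size)
{
    DemoPageHeader hdr;

    assert(page != NULL);
    assert(special_size <= DEMO_BLCKSZ - sizeof(hdr));

    memset(page, 0, DEMO_BLCKSZ);
    hdr.lower = (uint16_t) sizeof(hdr);
    hdr.upper = (uint16_t) (DEMO_BLCKSZ - special_size);
    hdr.special = hdr.upper;    /* free space ends where special begins */
    memcpy(page, &hdr, sizeof(hdr));
}
```

The extend path would then write this buffer where it currently writes
zeroes. Whether that is safe with respect to WAL replay and the existing
handling of new pages is exactly what I am asking about.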

Thanks,

Gavin
