Andriy Tkachuk <[EMAIL PROTECTED]> writes:
> On Fri, 7 Nov 2003, Tom Lane wrote:
>> Andriy Tkachuk <[EMAIL PROTECTED]> writes:
>>> Nov 5 20:22:42 monstr postgres[16071]: [3] PANIC: open of
>>> /usr/local/pgsql/data/pg_clog/0040 failed: No such file or directory
>>
>> Could we see ls -l /usr/local/pgsql/data/pg_clog/
> [10:49]/2:[EMAIL PROTECTED]:~>sudo ls -al /usr/local/pgsql/data/pg_clog
> total 40
> drwx------  2 pgsql  postgres   4096 Nov  7 03:28 .
> drwx------  6 pgsql  root       4096 Oct 23 10:45 ..
> -rw-------  1 pgsql  postgres  32768 Nov 10 10:47 000D

Okay, given that the file the code was trying to access is nowhere near
the current or past set of valid transaction numbers, it's pretty clear
that what you have is a corrupted transaction number in some tuple's
header.

The odds are that the transaction number is not the only thing affected;
usually when we see something like this, anywhere from dozens to hundreds
of bytes have been replaced by garbage data.

In the cases I've been able to study in the past, the cause seemed to be
faulty hardware or possibly kernel bugs --- for instance, someone recently
reported a case where a whole kilobyte of a Postgres file had been
replaced by what seemed to be part of a mail message.  I'd ascribe that to
either a disk drive writing a sector at the wrong place, or the kernel
getting confused about which buffer held which file.  So I'd recommend
running some hardware diagnostics and checking whether there are errata
available for your kernel.

As far as cleaning up the immediate damage is concerned, you'll probably
want to use pg_filedump or some such tool to get a better feeling for the
extent of the damage.  There are descriptions of this process in the
archives --- try searching for recent references to pg_filedump.

			regards, tom lane
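To make the "nowhere near" point concrete, here is a minimal sketch (not
part of the original mail) that computes which transaction IDs a given
pg_clog segment covers.  It assumes a default build --- 8K blocks, two
status bits per transaction, 32 pages per segment --- and the macro names
below are illustrative rather than copied from the backend sources:

    /*
     * Sketch: which transaction IDs does a pg_clog segment cover?
     * Assumes BLCKSZ = 8192, 2 status bits per transaction (so 4
     * transactions per byte), and 32 pages per segment file.
     */
    #include <stdio.h>
    #include <stdlib.h>

    #define BLCKSZ                  8192
    #define CLOG_XACTS_PER_BYTE     4       /* 2 status bits per xact */
    #define CLOG_XACTS_PER_PAGE     (BLCKSZ * CLOG_XACTS_PER_BYTE)
    #define CLOG_PAGES_PER_SEGMENT  32

    int main(int argc, char **argv)
    {
        /* segment file names are hex segment numbers, e.g. "000D", "0040" */
        unsigned long seg = strtoul(argc > 1 ? argv[1] : "0040", NULL, 16);
        unsigned long per_seg =
            (unsigned long) CLOG_XACTS_PER_PAGE * CLOG_PAGES_PER_SEGMENT;

        printf("pg_clog segment %04lX covers xids %lu .. %lu\n",
               seg, seg * per_seg, (seg + 1) * per_seg - 1);
        return 0;
    }

Run on "0040" this prints a range of roughly 67,108,864 .. 68,157,439,
while the only segment actually present (000D) covers about 13,631,488 ..
14,680,063 --- which is why the requested segment cannot correspond to any
transaction this installation has ever run, and the xid must be garbage.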
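For the cleanup step, a typical pg_filedump invocation looks something
like the following; the exact option letters have varied between
pg_filedump releases, and <dboid>/<relfilenode> are placeholders for the
affected table's file, so treat this as a sketch rather than definitive
syntax:

    pg_filedump -i -f /usr/local/pgsql/data/base/<dboid>/<relfilenode>

The -i option asks for interpreted item details, which should include each
tuple's XMIN/XMAX; tuples whose transaction numbers fall wildly outside
the range covered by the existing clog segments mark the damaged pages,
and -f adds a formatted dump of the raw block contents for closer
inspection.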