Re: [HACKERS] Suggestions for post-mortem...

Alvaro Herrera Wed, 25 Jan 2006 06:26:41 -0800

Philip Warner wrote:
> We just had a DB die quite nastily, and have no clear idea why.
> 
> Looking in the system logs shows nothing out of the ordinary, and
> looking in the db logs shows a few odd records:
> 
> 2006-01-25 12:25:31 EST [mail,5017]: ERROR:  failed to fetch new tuple
> for AFTER trigger
> 2006-01-25 12:26:01 EST [mail,93689]: WARNING:  index "XXXX_pkey"
> contains 1416 row versions, but table contains 1410 row versions
> 2006-01-25 12:26:01 EST [mail,93689]: HINT:  Rebuild the index with REINDEX.
> 2006-01-25 12:26:01 EST [mail,93689]: WARNING:  index "YYYY" contains
> 1416 row versions, but table contains 1410 row versions


Seems like a rather severe bug.  Maybe a race condition somewhere.  Note
that the trigger problem occurs 30 seconds before the VACUUM warning shows
up, and since this is a small table I doubt a vacuum could take that long.
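
The 30-second gap can be checked directly from the two quoted log lines.
A minimal sketch in Python, assuming every log line starts with a
"YYYY-MM-DD HH:MM:SS" timestamp as in the excerpt above; the helper name
log_time is only illustrative:

```python
from datetime import datetime

def log_time(line: str) -> datetime:
    """Parse the leading 'YYYY-MM-DD HH:MM:SS' timestamp of a log line.

    Assumes the 19-character prefix format seen in the quoted excerpt;
    the time zone suffix (EST) is ignored because both lines share it.
    """
    return datetime.strptime(line[:19], "%Y-%m-%d %H:%M:%S")

# The two lines below are copied from the quoted log excerpt.
trigger_error = log_time(
    "2006-01-25 12:25:31 EST [mail,5017]: ERROR:  failed to fetch new tuple"
)
vacuum_warning = log_time(
    '2006-01-25 12:26:01 EST [mail,93689]: WARNING:  index "XXXX_pkey"'
)

# Elapsed time between the trigger error and the first index warning.
gap_seconds = (vacuum_warning - trigger_error).total_seconds()
print(gap_seconds)
```

Note that this only measures the distance between the two log entries; it
says nothing about when the vacuum itself started.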

Can you confirm how long the vacuum takes to run?  Is this problem
isolated to this one table, or does it manifest somewhere else?  Do you
have other errors that may indicate a hardware problem?

It would be interesting to see where the extra index entries point.
To do this, however, we would need to read the complete pg_filedump
output of the index ... and if you REINDEXed it, the evidence is already gone.

-- 
Alvaro Herrera                        http://www.advogato.org/person/alvherre
"The Postgresql hackers have what I call a "NASA space shot" mentality.
 Quite refreshing in a world of "weekend drag racer" developers."
(Scott Marlowe)

