John Lister <john.lister...@kickstone.com> writes:
> Originally in psql-admin, but copied here at the request of Tom to..

Thanks for forwarding this.  The reason I wanted to call it to the
attention of pgsql-hackers is that the page contents seem a bit odd,
and I'm not sure that we should just write it off as "pilot error".
What we've got here is a page full of transient states of the pg_class
row for temp_queue, which as you explained is constantly being
re-TRUNCATEd by your application.  So the data contents of each state
of the row vary only in relfilenode, as expected.  One would also expect
that all the copies on a particular page of pg_class would form a single
HOT chain (the database is 8.3.5).  What we've actually got is a chain
that is broken in two places and lacks a root tuple(!).

How could it have got that way?  John's ill-advised removal of the
transaction logs should have resulted only in the page being a lot older
than it should be, not in a logically corrupt page.

The only other corruption mechanism I can think of is that pg_clog might
contain commit bits for some logically inconsistent set of transaction
numbers, due to some pages of pg_clog having made it to disk and others
not.  That could result in some of the intermediate tuples in the chain
not being seen as dead --- but that's not what we see here either.

Aside from the "how did this happen" puzzle, the real point of any
investigation of course ought to be whether we can make heap_page_prune
more robust.  At the very least it's undesirable to be leaving the page
in a state where VACUUM FULL will decide it can't shrink.

Ideas anyone?

                        regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to