Tom Lane wrote:
Aside from the "how did this happen" puzzle, the real point of any
investigation of course ought to be whether we can make heap_page_prune
more robust.  At the very least it's undesirable to be leaving the page
in a state where VACUUM FULL will decide it can't shrink.

I'm as puzzled as you are on how it happened.

The fundamental problem here is that we have HOT-updated tuples that are missing the root tuple. heap_prune_chain doesn't know how to remove the dead tuples from such chains, and neither does vacuum. What's worse is that there's no index pointer to the live tuple in the chain, and even VACUUM FULL doesn't fix that.

Could we modify heap_page_prune so that it detects such orphaned HOT tuples, and clears the HOT_UPDATED flag for them? That would at least let you recover. I'm pretty wary of trying to "fix" things after corruption in general since you can't be sure what's correct and what's not, but clearing the HOT_UPDATED flag seems safe and more likely to help than hurt. Clearing the flag would let the tuple become indexed again in a VACUUM FULL or REINDEX, so it would fix the inconsistency that otherwise a sequential scan can see the tuple but an index scan can not. It would also allow the next heap_page_prune operation to remove the remaining dead tuples in the chain.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to