On Sat, Feb 19, 2022 at 7:01 PM Andres Freund <and...@anarazel.de> wrote:
> > We can either do that, or we can throw an error concerning corruption
> > when heap_page_prune notices orphaned tuples. Neither seems
> > particularly appealing. But it definitely makes no sense to allow
> > lazy_scan_prune to spin in a futile attempt to reach agreement with
> > heap_page_prune about a DEAD tuple really being DEAD.
>
> Yea, this sucks. I think we should go for the rewrite of the
> heap_prune_chain() logic. The current approach is just never going to be
> robust.

No, it just isn't robust enough. But it's not that hard to fix. My
patch really wasn't invasive.

I confirmed that HeapTupleSatisfiesVacuum() and
heap_prune_satisfies_vacuum() agree that the heap-only tuple at offnum
2 is HEAPTUPLE_DEAD -- they are in agreement, as expected (so no
reason to think that there is a new bug involved). The problem here is
indeed just that heap_prune_chain() can't "get to" the tuple, given
its current design.

For anybody else that doesn't follow what we're talking about:

The "doesn't chain to anything else" code at the start of
heap_prune_chain() won't get to the heap-only tuple at offnum 2, since
the tuple is itself HeapTupleHeaderIsHotUpdated() -- the expectation
is that it'll be processed later on, once we locate the HOT chain's
root item. Since, of course, the "root item" was already LP_DEAD
before we even reached heap_page_prune() (on account of the pg_surgery
corruption), there is no possible way that that can happen later on.
And so we cannot find the same heap-only tuple and mark it LP_UNUSED
(which is how we always deal with HEAPTUPLE_DEAD heap-only tuples)
during pruning.

--
Peter Geoghegan
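
To make the control flow described above easier to follow, here is a
toy model in C. It is only a sketch, not the PostgreSQL code: every
Toy* type and toy_* function is hypothetical, visibility is reduced to
a precomputed "dead" flag, redirect roots are skipped rather than
followed, and the third item on the corrupt page (a live successor) is
an assumption made so that the HOT-updated tuple has something to point
at. The only behaviors taken from the description are that an already
LP_DEAD item is never used as a chain root, and that a DEAD heap-only
tuple is only freed directly when it is not HOT-updated.

```c
#include <stdbool.h>

/* Toy line pointer states (hypothetical names, not the real macros). */
typedef enum { TOY_LP_UNUSED, TOY_LP_NORMAL, TOY_LP_REDIRECT, TOY_LP_DEAD } ToyLpState;

typedef struct {
    ToyLpState state;
    bool heap_only;   /* stands in for HeapTupleHeaderIsHeapOnly() */
    bool hot_updated; /* stands in for HeapTupleHeaderIsHotUpdated() */
    bool dead;        /* stands in for a HEAPTUPLE_DEAD verdict */
    int  next;        /* offset of the next chain member, 0 = none */
} ToyItem;

#define TOY_MAX_ITEMS 8
typedef struct {
    ToyItem items[TOY_MAX_ITEMS + 1]; /* offsets are 1-based */
    int nitems;
} ToyPage;

/* Prune the chain starting at rootoff; returns the number of items changed. */
int toy_prune_chain(ToyPage *page, int rootoff)
{
    ToyItem *root = &page->items[rootoff];
    int ndeleted = 0;
    int off = rootoff;

    if (root->heap_only) {
        /* "Doesn't chain to anything else": free it right away. */
        if (root->dead && !root->hot_updated) {
            root->state = TOY_LP_UNUSED;
            ndeleted++;
        }
        /* Otherwise leave it for whoever visits the chain's root item. */
        return ndeleted;
    }

    /* Root item: walk the chain, freeing the leading run of dead members. */
    while (off != 0) {
        ToyItem *it = &page->items[off];
        if (it->state != TOY_LP_NORMAL) { off = 0; break; } /* broken chain */
        if (!it->dead) break;                               /* first survivor */
        if (off != rootoff) it->state = TOY_LP_UNUSED;
        ndeleted++;
        off = it->hot_updated ? it->next : 0;
    }
    if (ndeleted > 0) {
        /* Redirect to the survivor if there is one, else the root is dead. */
        root->state = (off != 0) ? TOY_LP_REDIRECT : TOY_LP_DEAD;
        root->next = off;
    }
    return ndeleted;
}

/* Visit every item as a potential chain root, skipping unused/dead slots. */
int toy_page_prune(ToyPage *page)
{
    int ndeleted = 0;
    for (int off = 1; off <= page->nitems; off++) {
        ToyLpState s = page->items[off].state;
        if (s != TOY_LP_NORMAL) continue; /* toy: redirects are skipped too */
        ndeleted += toy_prune_chain(page, off);
    }
    return ndeleted;
}

/* Three-item chain 1 -> 2 -> 3 where items 1 and 2 are dead, 3 is live. */
ToyPage toy_healthy_page(void)
{
    ToyPage page = {0};
    page.nitems = 3;
    page.items[1] = (ToyItem){ TOY_LP_NORMAL, false, true,  true,  2 };
    page.items[2] = (ToyItem){ TOY_LP_NORMAL, true,  true,  true,  3 };
    page.items[3] = (ToyItem){ TOY_LP_NORMAL, true,  false, false, 0 };
    return page;
}

/* Same chain, but the root was forced to LP_DEAD before pruning ran. */
ToyPage toy_corrupt_page(void)
{
    ToyPage page = toy_healthy_page();
    page.items[1] = (ToyItem){ TOY_LP_DEAD, false, false, false, 0 };
    return page;
}
```

On the healthy page, toy_page_prune() turns item 1 into a redirect and
frees item 2. On the corrupt page it changes nothing: item 1 is skipped
because it is already dead, and item 2 falls through the early return
because it is HOT-updated, so it stays a normal item even though it is
flagged dead.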