On Sat, Feb 19, 2022 at 3:08 PM Peter Geoghegan <p...@bowt.ie> wrote: > It's quite possible that this is nothing more than a bug in my > adversarial gizmo patch -- since I don't think that > ConditionalLockBufferForCleanup() can ever fail with a temp buffer > (though even that's not completely clear right now). Even if the > behavior that I saw does not indicate a bug on HEAD, it still seems > informative.
This very much looks like a bug in pg_surgery itself now -- attached is a draft fix. The temp table thing was a red herring. I found I could get exactly the same kind of failure when htab2 was a permanent table (which was how it originally appeared, before commit 0811f766fd made it into a temp table due to test flappiness issues). The relevant "vacuum freeze htab2" happens at a point after the test has already deliberately corrupted one of its tuples using heap_force_kill(). It's not that we aren't careful enough about the corruption at some point in vacuumlazy.c, which was my second theory. But I quickly discarded that idea, and came up with a third theory: the relevant heap_surgery.c path does the relevant ItemIdSetDead() to kill items, without also defragmenting the page to remove the tuples with storage, which is wrong. This meant that we depended on pruning happening (in this case during VACUUM) and defragmenting the page in passing. But there is no reason to not defragment the page within pg_surgery (at least no obvious reason), since we have a cleanup lock anyway. Theoretically you could blame this on lazy_scan_noprune instead, since it thinks it can collect LP_DEAD items while assuming that they have no storage, but that doesn't make much sense to me. There has never been any way of setting a heap item to LP_DEAD without also defragmenting the page. Since that's exactly what it means to prune a heap page. (Actually, the same used to be true about heap vacuuming, which worked more like heap pruning before Postgres 14, but that doesn't seem important.) -- Peter Geoghegan
From 81f01ca623b115647ee78a1b09bbb4458fb35dab Mon Sep 17 00:00:00 2001 From: Peter Geoghegan <p...@bowt.ie> Date: Sat, 19 Feb 2022 16:13:48 -0800 Subject: [PATCH 2/2] Fix for pg_surgery's heap_force_kill() function. --- contrib/pg_surgery/heap_surgery.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/contrib/pg_surgery/heap_surgery.c b/contrib/pg_surgery/heap_surgery.c index 3e641aa64..a3a193ba5 100644 --- a/contrib/pg_surgery/heap_surgery.c +++ b/contrib/pg_surgery/heap_surgery.c @@ -311,7 +311,8 @@ heap_force_common(FunctionCallInfo fcinfo, HeapTupleForceOption heap_force_opt) */ if (did_modify_page) { - /* Mark buffer dirty before we write WAL. */ + /* Defragment and mark buffer dirty before we write WAL. */ + PageRepairFragmentation(page); MarkBufferDirty(buf); /* XLOG stuff */ -- 2.30.2