On Sat, Feb 19, 2022 at 3:08 PM Peter Geoghegan <p...@bowt.ie> wrote:
> It's quite possible that this is nothing more than a bug in my
> adversarial gizmo patch -- since I don't think that
> ConditionalLockBufferForCleanup() can ever fail with a temp buffer
> (though even that's not completely clear right now). Even if the
> behavior that I saw does not indicate a bug on HEAD, it still seems
> informative.

This very much looks like a bug in pg_surgery itself now -- attached
is a draft fix.

The temp table thing was a red herring. I found I could get exactly
the same kind of failure when htab2 was a permanent table (which was
how it originally appeared, before commit 0811f766fd made it into a
temp table due to test flappiness issues). The relevant "vacuum freeze
htab2" happens at a point after the test has already deliberately
corrupted one of its tuples using heap_force_kill().  It's not that we
aren't careful enough about the corruption at some point in
vacuumlazy.c, which was my second theory. But I quickly discarded that
idea, and came up with a third theory: the relevant heap_surgery.c
path does the relevant ItemIdSetDead() to kill items, without also
defragmenting the page to remove the tuples with storage, which is
wrong.

This meant that we depended on pruning happening (in this case during
VACUUM) and defragmenting the page in passing. But there is no reason
to not defragment the page within pg_surgery (at least no obvious
reason), since we have a cleanup lock anyway.

Theoretically you could blame this on lazy_scan_noprune instead, since
it thinks it can collect LP_DEAD items while assuming that they have
no storage, but that doesn't make much sense to me. There has never
been any way of setting a heap item to LP_DEAD without also
defragmenting the page.  Since that's exactly what it means to prune a
heap page. (Actually, the same used to be true about heap vacuuming,
which worked more like heap pruning before Postgres 14, but that
doesn't seem important.)

-- 
Peter Geoghegan
From 81f01ca623b115647ee78a1b09bbb4458fb35dab Mon Sep 17 00:00:00 2001
From: Peter Geoghegan <p...@bowt.ie>
Date: Sat, 19 Feb 2022 16:13:48 -0800
Subject: [PATCH 2/2] Fix for pg_surgery's heap_force_kill() function.

---
 contrib/pg_surgery/heap_surgery.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/contrib/pg_surgery/heap_surgery.c 
b/contrib/pg_surgery/heap_surgery.c
index 3e641aa64..a3a193ba5 100644
--- a/contrib/pg_surgery/heap_surgery.c
+++ b/contrib/pg_surgery/heap_surgery.c
@@ -311,7 +311,8 @@ heap_force_common(FunctionCallInfo fcinfo, 
HeapTupleForceOption heap_force_opt)
                 */
                if (did_modify_page)
                {
-                       /* Mark buffer dirty before we write WAL. */
+                       /* Defragment and mark buffer dirty before we write 
WAL. */
+                       PageRepairFragmentation(page);
                        MarkBufferDirty(buf);
 
                        /* XLOG stuff */
-- 
2.30.2

Reply via email to