On 26/11/2025 10:20, Heikki Linnakangas wrote:
On 25/11/2025 22:51, Peter Geoghegan wrote:
In short, I think that we need to track something like
BtreeCheckState.previncompletesplit, but for half-dead pages. And then
actually use that within bt_child_highkey_check, to avoid spurious
false-positive reports of corruption.
I think it's even simpler than that, and we can just do this:
--- a/contrib/amcheck/verify_nbtree.c
+++ b/contrib/amcheck/verify_nbtree.c
@@ -2268,7 +2268,7 @@ bt_child_highkey_check(BtreeCheckState *state,
* If we visit page with high key, check that it is equal to the
* target key next to corresponding downlink.
*/
- if (!rightsplit && !P_RIGHTMOST(opaque))
+ if (!rightsplit && !P_RIGHTMOST(opaque) && !P_ISHALFDEAD(opaque))
{
BTPageOpaque topaque;
IndexTuple highkey;
Both BTP_HALF_DEAD and BTP_INCOMPLETE_SPLIT indicate that a downlink is
missing, but they work slightly differently. If a page is marked as
BTP_INCOMPLETE_SPLIT, it means that its right sibling has no downlink,
but if a page is marked as BTP_HALF_DEAD, it means that the page itself
has no downlink.
Ok, here's a proper patch with tests. The patch itself is the above
one-liner. It's in patch 0004.
While testing this, I bumped into another similar amcheck bug: if the
root page split is interrupted, verify_btree() complains:
ERROR: block 3 is not true root in index "nbtree_incomplete_splits_i_idx"
Attached patch 0002 contains a fix and a test for that. The fix for that
is also one-liner.
Summary of the patches:
Patch 0001 adds an injection point test for incomplete splits. We
already had such a test for GIN, which handles incomplete splits the
same way as B-tree. I copy-pasted and adapted the GIN test for B-tree.
This was an easy way to increase our test coverage.
Patch 0002 fixes the incomplete-root-split bug in amcheck. It modifies
the test added in patch 0001 to cover the bug fix.
Patch 0003 adds a test for half-dead pages, similar to what 0001 did for
incomplete splits.
Patch 0004 fixes the bogus half-deaf-page error in amcheck, i.e. the
issue that started this thread. It modifies the test introduced in patch
0003 to add amcheck calls, to cover the bug fix.
- Heikki
From a72a2ec9e093e945b0e749c8f2efda8a0c4c7373 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Mon, 1 Dec 2025 23:18:23 +0200
Subject: [PATCH v1 1/4] Add a test for incomplete splits in B-tree indexes
To increase our test coverage in general, and because I will add onto
this in the next commits to also test amcheck with incomplete splits.
This is copied from the similar test we had for GIN indexes. B-tree's
incomplete splits work similarly to GIN's, so with small changes, the
same test works for B-tree too.
Discussion: https://www.postgresql.org/message-id/[email protected]
---
src/backend/access/nbtree/nbtinsert.c | 9 +
src/test/modules/meson.build | 1 +
src/test/modules/nbtree/Makefile | 28 +++
.../expected/nbtree_incomplete_splits.out | 179 ++++++++++++++++++
src/test/modules/nbtree/meson.build | 16 ++
.../nbtree/sql/nbtree_incomplete_splits.sql | 134 +++++++++++++
6 files changed, 367 insertions(+)
create mode 100644 src/test/modules/nbtree/Makefile
create mode 100644 src/test/modules/nbtree/expected/nbtree_incomplete_splits.out
create mode 100644 src/test/modules/nbtree/meson.build
create mode 100644 src/test/modules/nbtree/sql/nbtree_incomplete_splits.sql
diff --git a/src/backend/access/nbtree/nbtinsert.c b/src/backend/access/nbtree/nbtinsert.c
index 7c113c007e5..3a4b791f2ab 100644
--- a/src/backend/access/nbtree/nbtinsert.c
+++ b/src/backend/access/nbtree/nbtinsert.c
@@ -26,6 +26,7 @@
#include "miscadmin.h"
#include "storage/lmgr.h"
#include "storage/predicate.h"
+#include "utils/injection_point.h"
/* Minimum tree height for application of fastpath optimization */
#define BTREE_FASTPATH_MIN_LEVEL 2
@@ -1239,6 +1240,13 @@ _bt_insertonpg(Relation rel,
* page.
*----------
*/
+#ifdef USE_INJECTION_POINTS
+ if (P_ISLEAF(opaque))
+ INJECTION_POINT("nbtree-leave-leaf-split-incomplete", NULL);
+ else
+ INJECTION_POINT("nbtree-leave-internal-split-incomplete", NULL);
+#endif
+
_bt_insert_parent(rel, heaprel, buf, rbuf, stack, isroot, isonly);
}
else
@@ -2285,6 +2293,7 @@ _bt_finish_split(Relation rel, Relation heaprel, Buffer lbuf, BTStack stack)
/* Was this the only page on the level before split? */
wasonly = (P_LEFTMOST(lpageop) && P_RIGHTMOST(rpageop));
+ INJECTION_POINT("nbtree-finish-incomplete-split", NULL);
elog(DEBUG1, "finishing incomplete split of %u/%u",
BufferGetBlockNumber(lbuf), BufferGetBlockNumber(rbuf));
diff --git a/src/test/modules/meson.build b/src/test/modules/meson.build
index f5114469b92..cc57461e59a 100644
--- a/src/test/modules/meson.build
+++ b/src/test/modules/meson.build
@@ -10,6 +10,7 @@ subdir('index')
subdir('injection_points')
subdir('ldap_password_func')
subdir('libpq_pipeline')
+subdir('nbtree')
subdir('oauth_validator')
subdir('plsample')
subdir('spgist_name_ops')
diff --git a/src/test/modules/nbtree/Makefile b/src/test/modules/nbtree/Makefile
new file mode 100644
index 00000000000..34946a84445
--- /dev/null
+++ b/src/test/modules/nbtree/Makefile
@@ -0,0 +1,28 @@
+# src/test/modules/nbtree/Makefile
+
+EXTRA_INSTALL = src/test/modules/injection_points
+
+REGRESS = nbtree_incomplete_splits
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = src/test/modules/nbtree
+top_builddir = ../../../..
+include $(top_builddir)/src/Makefile.global
+
+# XXX: This test is conditional on enable_injection_points in the
+# parent Makefile, so we should never get here in the first place if
+# injection points are not enabled. But the buildfarm 'misc-check'
+# step doesn't pay attention to the if-condition in the parent
+# Makefile. To work around that, disable running the test here too.
+ifeq ($(enable_injection_points),yes)
+include $(top_srcdir)/contrib/contrib-global.mk
+else
+check:
+ @echo "injection points are disabled in this build"
+endif
+
+endif
diff --git a/src/test/modules/nbtree/expected/nbtree_incomplete_splits.out b/src/test/modules/nbtree/expected/nbtree_incomplete_splits.out
new file mode 100644
index 00000000000..88e87e875c8
--- /dev/null
+++ b/src/test/modules/nbtree/expected/nbtree_incomplete_splits.out
@@ -0,0 +1,179 @@
+--
+-- Test incomplete splits in B-tree indexes.
+--
+-- We use a test table with integers from 1 to :next_i. Each integer
+-- occurs exactly once, no gaps or duplicates, although the index does
+-- contain some duplicates because some of the inserting transactions
+-- are rolled back during the test. The exact contents of the table
+-- depend on the physical layout of the index, which in turn depends
+-- at least on the block size, so instead of checking the exact
+-- contents, we check those invariants. :next_i psql variable is
+-- maintained at all times to hold the last inserted integer + 1.
+--
+-- This uses injection points to cause errors that leave some page
+-- splits in "incomplete" state
+set client_min_messages TO 'warning';
+create extension if not exists injection_points;
+reset client_min_messages;
+-- Make all injection points local to this process, for concurrency.
+SELECT injection_points_set_local();
+ injection_points_set_local
+----------------------------
+
+(1 row)
+
+-- Use the index for all the queries
+set enable_seqscan=off;
+-- Print a NOTICE whenever an incomplete split gets fixed
+SELECT injection_points_attach('nbtree-finish-incomplete-split', 'notice');
+ injection_points_attach
+-------------------------
+
+(1 row)
+
+--
+-- First create the test table and some helper functions
+--
+create table nbtree_incomplete_splits(i int4) with (autovacuum_enabled = off);
+create index nbtree_incomplete_splits_i_idx on nbtree_incomplete_splits using btree (i);
+-- Inserts 'n' rows to the test table. Pass :next_i as the first
+-- argument, returns the new value for :next_i.
+create function insert_n(first_i int, n int) returns int language plpgsql as $$
+begin
+ insert into nbtree_incomplete_splits select g from generate_series(first_i, first_i + n - 1) as g;
+ return first_i + n;
+end;
+$$;
+-- Inserts to the table until an insert fails. Like insert_n(), returns the
+-- new value for :next_i.
+create function insert_until_fail(next_i int, step int default 1) returns int language plpgsql as $$
+declare
+ i integer;
+begin
+ -- Insert rows in batches of 'step' rows each, until an error occurs.
+ i := 0;
+ loop
+ begin
+ select insert_n(next_i, step) into next_i;
+ exception when others then
+ raise notice 'failed with: %', sqlerrm;
+ exit;
+ end;
+
+ -- The caller is expected to set an injection point that eventually
+ -- causes an error. But bail out if still no error after 10000
+ -- attempts, so that we don't get stuck in an infinite loop.
+ i := i + 1;
+ if i = 10000 then
+ raise 'no error on inserts after % iterations', i;
+ end if;
+ end loop;
+
+ return next_i;
+end;
+$$;
+-- Check the invariants.
+create function verify(next_i int) returns bool language plpgsql as $$
+declare
+ c integer;
+begin
+ -- Perform a scan over the trailing part of the index, where the
+ -- possible incomplete splits are. (We don't check the whole table,
+ -- because that'd be pretty slow.)
+ --
+ -- Find all rows that overlap with the last 200 inserted integers. Or
+ -- the next 100, which shouldn't exist.
+ select count(*) into c from nbtree_incomplete_splits where i between next_i - 200 and next_i + 100;
+ if c <> 200 then
+ raise 'unexpected count % ', c;
+ end if;
+ return true;
+end;
+$$;
+-- Insert one array to get started.
+select insert_n(1, 1000) as next_i
+\gset
+select verify(:next_i);
+ verify
+--------
+ t
+(1 row)
+
+--
+-- Test incomplete leaf split
+--
+SELECT injection_points_attach('nbtree-leave-leaf-split-incomplete', 'error');
+ injection_points_attach
+-------------------------
+
+(1 row)
+
+select insert_until_fail(:next_i) as next_i
+\gset
+NOTICE: failed with: error triggered for injection point nbtree-leave-leaf-split-incomplete
+SELECT injection_points_detach('nbtree-leave-leaf-split-incomplete');
+ injection_points_detach
+-------------------------
+
+(1 row)
+
+-- Verify that a scan works even though there's an incomplete split
+select verify(:next_i);
+ verify
+--------
+ t
+(1 row)
+
+-- Insert some more rows, finishing the split
+select insert_n(:next_i, 10) as next_i
+\gset
+NOTICE: notice triggered for injection point nbtree-finish-incomplete-split
+-- Verify that a scan still works
+select verify(:next_i);
+ verify
+--------
+ t
+(1 row)
+
+--
+-- Test incomplete internal page split
+--
+SELECT injection_points_attach('nbtree-leave-internal-split-incomplete', 'error');
+ injection_points_attach
+-------------------------
+
+(1 row)
+
+select insert_until_fail(:next_i, 100) as next_i
+\gset
+NOTICE: failed with: error triggered for injection point nbtree-leave-internal-split-incomplete
+SELECT injection_points_detach('nbtree-leave-internal-split-incomplete');
+ injection_points_detach
+-------------------------
+
+(1 row)
+
+ -- Verify that a scan works even though there's an incomplete split
+select verify(:next_i);
+ verify
+--------
+ t
+(1 row)
+
+-- Insert some more rows, finishing the split
+select insert_n(:next_i, 10) as next_i
+\gset
+NOTICE: notice triggered for injection point nbtree-finish-incomplete-split
+-- Verify that a scan still works
+select verify(:next_i);
+ verify
+--------
+ t
+(1 row)
+
+SELECT injection_points_detach('nbtree-finish-incomplete-split');
+ injection_points_detach
+-------------------------
+
+(1 row)
+
diff --git a/src/test/modules/nbtree/meson.build b/src/test/modules/nbtree/meson.build
new file mode 100644
index 00000000000..efebf30a16f
--- /dev/null
+++ b/src/test/modules/nbtree/meson.build
@@ -0,0 +1,16 @@
+# Copyright (c) 2022-2025, PostgreSQL Global Development Group
+
+if not get_option('injection_points')
+ subdir_done()
+endif
+
+tests += {
+ 'name': 'nbtree',
+ 'sd': meson.current_source_dir(),
+ 'bd': meson.current_build_dir(),
+ 'regress': {
+ 'sql': [
+ 'nbtree_incomplete_splits',
+ ],
+ },
+}
diff --git a/src/test/modules/nbtree/sql/nbtree_incomplete_splits.sql b/src/test/modules/nbtree/sql/nbtree_incomplete_splits.sql
new file mode 100644
index 00000000000..0609ed7464e
--- /dev/null
+++ b/src/test/modules/nbtree/sql/nbtree_incomplete_splits.sql
@@ -0,0 +1,134 @@
+--
+-- Test incomplete splits in B-tree indexes.
+--
+-- We use a test table with integers from 1 to :next_i. Each integer
+-- occurs exactly once, no gaps or duplicates, although the index does
+-- contain some duplicates because some of the inserting transactions
+-- are rolled back during the test. The exact contents of the table
+-- depend on the physical layout of the index, which in turn depends
+-- at least on the block size, so instead of checking the exact
+-- contents, we check those invariants. :next_i psql variable is
+-- maintained at all times to hold the last inserted integer + 1.
+--
+
+-- This uses injection points to cause errors that leave some page
+-- splits in "incomplete" state
+set client_min_messages TO 'warning';
+create extension if not exists injection_points;
+reset client_min_messages;
+
+-- Make all injection points local to this process, for concurrency.
+SELECT injection_points_set_local();
+
+-- Use the index for all the queries
+set enable_seqscan=off;
+
+-- Print a NOTICE whenever an incomplete split gets fixed
+SELECT injection_points_attach('nbtree-finish-incomplete-split', 'notice');
+
+--
+-- First create the test table and some helper functions
+--
+create table nbtree_incomplete_splits(i int4) with (autovacuum_enabled = off);
+
+create index nbtree_incomplete_splits_i_idx on nbtree_incomplete_splits using btree (i);
+
+-- Inserts 'n' rows to the test table. Pass :next_i as the first
+-- argument, returns the new value for :next_i.
+create function insert_n(first_i int, n int) returns int language plpgsql as $$
+begin
+ insert into nbtree_incomplete_splits select g from generate_series(first_i, first_i + n - 1) as g;
+ return first_i + n;
+end;
+$$;
+
+-- Inserts to the table until an insert fails. Like insert_n(), returns the
+-- new value for :next_i.
+create function insert_until_fail(next_i int, step int default 1) returns int language plpgsql as $$
+declare
+ i integer;
+begin
+ -- Insert rows in batches of 'step' rows each, until an error occurs.
+ i := 0;
+ loop
+ begin
+ select insert_n(next_i, step) into next_i;
+ exception when others then
+ raise notice 'failed with: %', sqlerrm;
+ exit;
+ end;
+
+ -- The caller is expected to set an injection point that eventually
+ -- causes an error. But bail out if still no error after 10000
+ -- attempts, so that we don't get stuck in an infinite loop.
+ i := i + 1;
+ if i = 10000 then
+ raise 'no error on inserts after % iterations', i;
+ end if;
+ end loop;
+
+ return next_i;
+end;
+$$;
+
+-- Check the invariants.
+create function verify(next_i int) returns bool language plpgsql as $$
+declare
+ c integer;
+begin
+ -- Perform a scan over the trailing part of the index, where the
+ -- possible incomplete splits are. (We don't check the whole table,
+ -- because that'd be pretty slow.)
+ --
+ -- Find all rows that overlap with the last 200 inserted integers. Or
+ -- the next 100, which shouldn't exist.
+ select count(*) into c from nbtree_incomplete_splits where i between next_i - 200 and next_i + 100;
+ if c <> 200 then
+ raise 'unexpected count % ', c;
+ end if;
+ return true;
+end;
+$$;
+
+-- Insert one array to get started.
+select insert_n(1, 1000) as next_i
+\gset
+select verify(:next_i);
+
+
+--
+-- Test incomplete leaf split
+--
+SELECT injection_points_attach('nbtree-leave-leaf-split-incomplete', 'error');
+select insert_until_fail(:next_i) as next_i
+\gset
+SELECT injection_points_detach('nbtree-leave-leaf-split-incomplete');
+
+-- Verify that a scan works even though there's an incomplete split
+select verify(:next_i);
+
+-- Insert some more rows, finishing the split
+select insert_n(:next_i, 10) as next_i
+\gset
+-- Verify that a scan still works
+select verify(:next_i);
+
+
+--
+-- Test incomplete internal page split
+--
+SELECT injection_points_attach('nbtree-leave-internal-split-incomplete', 'error');
+select insert_until_fail(:next_i, 100) as next_i
+\gset
+SELECT injection_points_detach('nbtree-leave-internal-split-incomplete');
+
+ -- Verify that a scan works even though there's an incomplete split
+select verify(:next_i);
+
+-- Insert some more rows, finishing the split
+select insert_n(:next_i, 10) as next_i
+\gset
+-- Verify that a scan still works
+select verify(:next_i);
+
+SELECT injection_points_detach('nbtree-finish-incomplete-split');
--
2.47.3
From 39433853ddfb67fdd3ccc2e2b083c6ea329e0f96 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Mon, 1 Dec 2025 23:18:26 +0200
Subject: [PATCH v1 2/4] Fix amcheck's handling of incomplete root splits in
B-tree
Discussion: https://www.postgresql.org/message-id/[email protected]
---
contrib/amcheck/verify_nbtree.c | 2 +-
.../modules/nbtree/expected/nbtree_incomplete_splits.out | 7 +++++++
src/test/modules/nbtree/sql/nbtree_incomplete_splits.sql | 7 +++++++
3 files changed, 15 insertions(+), 1 deletion(-)
diff --git a/contrib/amcheck/verify_nbtree.c b/contrib/amcheck/verify_nbtree.c
index 0949c88983a..f26c20b59aa 100644
--- a/contrib/amcheck/verify_nbtree.c
+++ b/contrib/amcheck/verify_nbtree.c
@@ -721,7 +721,7 @@ bt_check_level_from_leftmost(BtreeCheckState *state, BtreeLevel level)
errmsg("block %u is not leftmost in index \"%s\"",
current, RelationGetRelationName(state->rel))));
- if (level.istruerootlevel && !P_ISROOT(opaque))
+ if (level.istruerootlevel && (!P_ISROOT(opaque) && !P_INCOMPLETE_SPLIT(opaque)))
ereport(ERROR,
(errcode(ERRCODE_INDEX_CORRUPTED),
errmsg("block %u is not true root in index \"%s\"",
diff --git a/src/test/modules/nbtree/expected/nbtree_incomplete_splits.out b/src/test/modules/nbtree/expected/nbtree_incomplete_splits.out
index 88e87e875c8..161eb3cbfa8 100644
--- a/src/test/modules/nbtree/expected/nbtree_incomplete_splits.out
+++ b/src/test/modules/nbtree/expected/nbtree_incomplete_splits.out
@@ -14,6 +14,7 @@
-- splits in "incomplete" state
set client_min_messages TO 'warning';
create extension if not exists injection_points;
+create extension if not exists amcheck;
reset client_min_messages;
-- Make all injection points local to this process, for concurrency.
SELECT injection_points_set_local();
@@ -87,6 +88,12 @@ begin
if c <> 200 then
raise 'unexpected count % ', c;
end if;
+
+ -- Also check the index with amcheck. Both to test that the index is
+ -- valid, but also to test that amcheck doesn't wrongly complain
+ -- about incomplete splits.
+ perform bt_index_parent_check('nbtree_incomplete_splits_i_idx'::regclass, true, true);
+
return true;
end;
$$;
diff --git a/src/test/modules/nbtree/sql/nbtree_incomplete_splits.sql b/src/test/modules/nbtree/sql/nbtree_incomplete_splits.sql
index 0609ed7464e..f6b22786d37 100644
--- a/src/test/modules/nbtree/sql/nbtree_incomplete_splits.sql
+++ b/src/test/modules/nbtree/sql/nbtree_incomplete_splits.sql
@@ -15,6 +15,7 @@
-- splits in "incomplete" state
set client_min_messages TO 'warning';
create extension if not exists injection_points;
+create extension if not exists amcheck;
reset client_min_messages;
-- Make all injection points local to this process, for concurrency.
@@ -86,6 +87,12 @@ begin
if c <> 200 then
raise 'unexpected count % ', c;
end if;
+
+ -- Also check the index with amcheck. Both to test that the index is
+ -- valid, but also to test that amcheck doesn't wrongly complain
+ -- about incomplete splits.
+ perform bt_index_parent_check('nbtree_incomplete_splits_i_idx'::regclass, true, true);
+
return true;
end;
$$;
--
2.47.3
From c6db30f8927e846edf02a1fee8788cff4df95f09 Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Mon, 1 Dec 2025 23:18:29 +0200
Subject: [PATCH v1 3/4] Add a test for half-dead pages in B-tree indexes
To increase our test coverage in general, and because I will use this
in the next commit to test a bug we have in amcheck at the moment.
Discussion: https://www.postgresql.org/message-id/[email protected]
---
src/backend/access/nbtree/nbtpage.c | 7 ++
.../expected/nbtree_half_dead_pages.out | 71 +++++++++++++++++++
src/test/modules/nbtree/meson.build | 1 +
.../nbtree/sql/nbtree_half_dead_pages.sql | 43 +++++++++++
4 files changed, 122 insertions(+)
create mode 100644 src/test/modules/nbtree/expected/nbtree_half_dead_pages.out
create mode 100644 src/test/modules/nbtree/sql/nbtree_half_dead_pages.sql
diff --git a/src/backend/access/nbtree/nbtpage.c b/src/backend/access/nbtree/nbtpage.c
index 30b43a4dd18..a8d56fe5a7c 100644
--- a/src/backend/access/nbtree/nbtpage.c
+++ b/src/backend/access/nbtree/nbtpage.c
@@ -33,6 +33,7 @@
#include "storage/indexfsm.h"
#include "storage/predicate.h"
#include "storage/procarray.h"
+#include "utils/injection_point.h"
#include "utils/memdebug.h"
#include "utils/memutils.h"
#include "utils/snapmgr.h"
@@ -2003,6 +2004,10 @@ _bt_pagedel(Relation rel, Buffer leafbuf, BTVacState *vstate)
return;
}
}
+ else
+ {
+ INJECTION_POINT("nbtree-finish-half-dead-page-vacuum", NULL);
+ }
/*
* Then unlink it from its siblings. Each call to
@@ -2349,6 +2354,8 @@ _bt_unlink_halfdead_page(Relation rel, Buffer leafbuf, BlockNumber scanblkno,
_bt_unlockbuf(rel, leafbuf);
+ INJECTION_POINT("nbtree-leave-page-half-dead", NULL);
+
/*
* Check here, as calling loops will have locks held, preventing
* interrupts from being processed.
diff --git a/src/test/modules/nbtree/expected/nbtree_half_dead_pages.out b/src/test/modules/nbtree/expected/nbtree_half_dead_pages.out
new file mode 100644
index 00000000000..8fd472f8df2
--- /dev/null
+++ b/src/test/modules/nbtree/expected/nbtree_half_dead_pages.out
@@ -0,0 +1,71 @@
+--
+-- Test half-dead pages in B-tree indexes.
+--
+-- Half-dead pages is an intermediate state while vacuum is deleting a
+-- page. You can encounter them if you query concurrently with vacuum,
+-- or if vacuum is interrupted while it's deleting a page. A B-tree
+-- with half-dead pages is a valid state, but they rarely observed by
+-- other backends in practice because, so it's good to have some
+-- targeted tests to exercise them.
+--
+-- This uses injection points to interrupt some page deletions
+set client_min_messages TO 'warning';
+create extension if not exists injection_points;
+reset client_min_messages;
+-- Make all injection points local to this process, for concurrency.
+SELECT injection_points_set_local();
+ injection_points_set_local
+----------------------------
+
+(1 row)
+
+-- Use the index for all the queries
+set enable_seqscan=off;
+-- Print a NOTICE whenever a half-dead page is deleted
+SELECT injection_points_attach('nbtree-finish-half-dead-page-vacuum', 'notice');
+ injection_points_attach
+-------------------------
+
+(1 row)
+
+create table nbtree_half_dead_pages(id bigint) with (autovacuum_enabled = off);
+insert into nbtree_half_dead_pages SELECT g from generate_series(1, 150000) g;
+create index nbtree_half_dead_pages_id_idx on nbtree_half_dead_pages using btree (id);
+delete from nbtree_half_dead_pages where id > 100000 and id < 120000;
+-- Run VACUUM and interrupt it so that it leaves behind a half-dead page
+SELECT injection_points_attach('nbtree-leave-page-half-dead', 'error');
+ injection_points_attach
+-------------------------
+
+(1 row)
+
+vacuum nbtree_half_dead_pages;
+ERROR: error triggered for injection point nbtree-leave-page-half-dead
+CONTEXT: while vacuuming index "nbtree_half_dead_pages_id_idx" of relation "public.nbtree_half_dead_pages"
+SELECT injection_points_detach('nbtree-leave-page-half-dead');
+ injection_points_detach
+-------------------------
+
+(1 row)
+
+select * from nbtree_half_dead_pages where id > 99998 and id < 120002;
+ id
+--------
+ 99999
+ 100000
+ 120000
+ 120001
+(4 rows)
+
+-- Finish the deletion and re-check
+vacuum nbtree_half_dead_pages;
+NOTICE: notice triggered for injection point nbtree-finish-half-dead-page-vacuum
+select * from nbtree_half_dead_pages where id > 99998 and id < 120002;
+ id
+--------
+ 99999
+ 100000
+ 120000
+ 120001
+(4 rows)
+
diff --git a/src/test/modules/nbtree/meson.build b/src/test/modules/nbtree/meson.build
index efebf30a16f..1b9a34d37ca 100644
--- a/src/test/modules/nbtree/meson.build
+++ b/src/test/modules/nbtree/meson.build
@@ -10,6 +10,7 @@ tests += {
'bd': meson.current_build_dir(),
'regress': {
'sql': [
+ 'nbtree_half_dead_pages',
'nbtree_incomplete_splits',
],
},
diff --git a/src/test/modules/nbtree/sql/nbtree_half_dead_pages.sql b/src/test/modules/nbtree/sql/nbtree_half_dead_pages.sql
new file mode 100644
index 00000000000..d4b9a3f824d
--- /dev/null
+++ b/src/test/modules/nbtree/sql/nbtree_half_dead_pages.sql
@@ -0,0 +1,43 @@
+--
+-- Test half-dead pages in B-tree indexes.
+--
+-- Half-dead pages is an intermediate state while vacuum is deleting a
+-- page. You can encounter them if you query concurrently with vacuum,
+-- or if vacuum is interrupted while it's deleting a page. A B-tree
+-- with half-dead pages is a valid state, but they rarely observed by
+-- other backends in practice because, so it's good to have some
+-- targeted tests to exercise them.
+--
+
+-- This uses injection points to interrupt some page deletions
+set client_min_messages TO 'warning';
+create extension if not exists injection_points;
+reset client_min_messages;
+
+-- Make all injection points local to this process, for concurrency.
+SELECT injection_points_set_local();
+
+-- Use the index for all the queries
+set enable_seqscan=off;
+
+-- Print a NOTICE whenever a half-dead page is deleted
+SELECT injection_points_attach('nbtree-finish-half-dead-page-vacuum', 'notice');
+
+create table nbtree_half_dead_pages(id bigint) with (autovacuum_enabled = off);
+
+insert into nbtree_half_dead_pages SELECT g from generate_series(1, 150000) g;
+
+create index nbtree_half_dead_pages_id_idx on nbtree_half_dead_pages using btree (id);
+
+delete from nbtree_half_dead_pages where id > 100000 and id < 120000;
+
+-- Run VACUUM and interrupt it so that it leaves behind a half-dead page
+SELECT injection_points_attach('nbtree-leave-page-half-dead', 'error');
+vacuum nbtree_half_dead_pages;
+SELECT injection_points_detach('nbtree-leave-page-half-dead');
+
+select * from nbtree_half_dead_pages where id > 99998 and id < 120002;
+
+-- Finish the deletion and re-check
+vacuum nbtree_half_dead_pages;
+select * from nbtree_half_dead_pages where id > 99998 and id < 120002;
--
2.47.3
From 5d765cda1a35450cc205b9b9576e045b5145a12c Mon Sep 17 00:00:00 2001
From: Heikki Linnakangas <[email protected]>
Date: Mon, 1 Dec 2025 23:18:31 +0200
Subject: [PATCH v1 4/4] Fix amcheck's handling of half-dead B-tree pages
Discussion: https://www.postgresql.org/message-id/[email protected]
---
contrib/amcheck/verify_nbtree.c | 2 +-
.../nbtree/expected/nbtree_half_dead_pages.out | 14 ++++++++++++++
.../modules/nbtree/sql/nbtree_half_dead_pages.sql | 5 +++++
3 files changed, 20 insertions(+), 1 deletion(-)
diff --git a/contrib/amcheck/verify_nbtree.c b/contrib/amcheck/verify_nbtree.c
index f26c20b59aa..75751e2a1e9 100644
--- a/contrib/amcheck/verify_nbtree.c
+++ b/contrib/amcheck/verify_nbtree.c
@@ -2268,7 +2268,7 @@ bt_child_highkey_check(BtreeCheckState *state,
* If we visit page with high key, check that it is equal to the
* target key next to corresponding downlink.
*/
- if (!rightsplit && !P_RIGHTMOST(opaque))
+ if (!rightsplit && !P_RIGHTMOST(opaque) && !P_ISHALFDEAD(opaque))
{
BTPageOpaque topaque;
IndexTuple highkey;
diff --git a/src/test/modules/nbtree/expected/nbtree_half_dead_pages.out b/src/test/modules/nbtree/expected/nbtree_half_dead_pages.out
index 8fd472f8df2..e94f016696d 100644
--- a/src/test/modules/nbtree/expected/nbtree_half_dead_pages.out
+++ b/src/test/modules/nbtree/expected/nbtree_half_dead_pages.out
@@ -11,6 +11,7 @@
-- This uses injection points to interrupt some page deletions
set client_min_messages TO 'warning';
create extension if not exists injection_points;
+create extension if not exists amcheck;
reset client_min_messages;
-- Make all injection points local to this process, for concurrency.
SELECT injection_points_set_local();
@@ -57,6 +58,13 @@ select * from nbtree_half_dead_pages where id > 99998 and id < 120002;
120001
(4 rows)
+-- Also check the index with amcheck
+select bt_index_parent_check('nbtree_half_dead_pages_id_idx'::regclass, true, true);
+ bt_index_parent_check
+-----------------------
+
+(1 row)
+
-- Finish the deletion and re-check
vacuum nbtree_half_dead_pages;
NOTICE: notice triggered for injection point nbtree-finish-half-dead-page-vacuum
@@ -69,3 +77,9 @@ select * from nbtree_half_dead_pages where id > 99998 and id < 120002;
120001
(4 rows)
+select bt_index_parent_check('nbtree_half_dead_pages_id_idx'::regclass, true, true);
+ bt_index_parent_check
+-----------------------
+
+(1 row)
+
diff --git a/src/test/modules/nbtree/sql/nbtree_half_dead_pages.sql b/src/test/modules/nbtree/sql/nbtree_half_dead_pages.sql
index d4b9a3f824d..fd279b87e0e 100644
--- a/src/test/modules/nbtree/sql/nbtree_half_dead_pages.sql
+++ b/src/test/modules/nbtree/sql/nbtree_half_dead_pages.sql
@@ -12,6 +12,7 @@
-- This uses injection points to interrupt some page deletions
set client_min_messages TO 'warning';
create extension if not exists injection_points;
+create extension if not exists amcheck;
reset client_min_messages;
-- Make all injection points local to this process, for concurrency.
@@ -38,6 +39,10 @@ SELECT injection_points_detach('nbtree-leave-page-half-dead');
select * from nbtree_half_dead_pages where id > 99998 and id < 120002;
+-- Also check the index with amcheck
+select bt_index_parent_check('nbtree_half_dead_pages_id_idx'::regclass, true, true);
+
-- Finish the deletion and re-check
vacuum nbtree_half_dead_pages;
select * from nbtree_half_dead_pages where id > 99998 and id < 120002;
+select bt_index_parent_check('nbtree_half_dead_pages_id_idx'::regclass, true, true);
--
2.47.3