On Sun, Jun 15, 2025 at 4:24 PM Andrey Borodin <x4...@yandex-team.ru> wrote:
>
>
> Hi Arseniy!
>
> Thanks for finding these problems.
> I had several attempts to wrap my head around original patch with fixes, but 
> when it was broken into several steps it finally became easier for me.
> Here are some thought about patches.
>

Hi Andrey! Thank you for the review.

>
> > On 10 Jun 2025, at 13:18, Arseniy Mukhin <arseniy.mukhin....@gmail.com> 
> > wrote:
> > <0001-amcheck-Add-gin_index_check-on-a-multicolumn-index.patch>
>
> The test seems harmless and nice to have. I understand that this test is 
> needed to extend coverage.
> Perhaps, we could verify that some code is actually triggered. Personally, I 
> would be happy if we could some add injection points with notices at tested 
> branches. But, AFAIK, it's too much of a burden to have injection points in 
> contrib extensions. We had very similar problem with sort patch in btree_gist 
> and eventually gave up. elog(DEBUG) was not a good solution too, because it 
> was unstable.
> See 'gin-finish-incomplete-split' or 'hash-aggregate-enter-spill-mode' for 
> reference.

I'm not familiar with injections points much, but I think I got the
idea, sounds interesting. Thank you for the references.

>
> Having some TAP tests sounds like a very good idea.
>
> I'm a bit surprised by excluding some letters from random_string(), but 
> perhaps it's fine for this test.
>

Yeah, there is no reason why we can't use vowels here, so I will add
them so that it doesn't look like there is any point in their absence.

> Somewhere here:
> +               INSERT INTO $relname (a) VALUES (('{' || 'pppppppppp' || 
> random_string(1870) ||'}')::text[]);
> I'd like to have a comment explaining number 1870. And, probably, you expect 
> exactly 2 tuples on root page, right?
>

The idea behind "random_string(1870)" was to get split as fast as
possible, but tuples with size > 2kb are toasted, so we have to use
something about 2k here. I think I took 1870 from some other place
where it was necessary, but here we can round it to 1900. So I'll
replace 1870 with 1900 and add a comment about the size. Also gonna
add some comments about datasets in some tests to make it more clear.

> Are we 100% certain that 'rrrrrrrrr' will always be on root page?

I'm not 100% sure. AFAIK the split algorithm is deterministic and the
idea was that if we use very long tuples, then all other factors will
be too small to influence what key we will see on the root page.

> I do not see much value in having variables $relname and $indexname. I'd just 
> substitute its usages with literals. But I'm not sure, maybe this structure 
> will be used in your tests later...

I added variables just because we use index name and table name
several times, but I don't mind getting rid of them.

>
> In this function
> +sub string_replace_block
> I'd suggest a little bit of comments. Also, perhaps, fsync of files, but 
> 001_verify_heapam.pl does not do fsync. So, maybe it's OK here too.
>

Will add a comment here.

> Also, I have a wild idea. Maybe add an assert that block size if 8192 and 
> just exit otherwise?

I like the idea. I thought maybe it would be great to have some
function that every TAP test can use if it needs a certain block size?

> And, maybe instead of gin_clean_pending_list() you can just create an index 
> with fastupdate=off.

Yeah, I think we can do it even simpler if we move index creation to
the end as regression tests do.

>
> > On 10 Jun 2025, at 13:18, Arseniy Mukhin <arseniy.mukhin....@gmail.com> 
> > wrote:
> > <0003-patch-2-gin_check_parent_keys_consistency.patch>
>
> The patch seems correct to me.
> Except this
> +       my $blkno = 5;  # leaf
> in test reads scary. Will it be stable on buildfarm?
>

Not sure, but I thought that blkno should be more or less the same everywhere.

>
> > On 9 May 2025, at 17:43, Arseniy Mukhin <arseniy.mukhin....@gmail.com> 
> > wrote:
> >
> > 10) README says "Vacuum never deletes tuples or pages from the entry
> > tree." But check assumes that it's possible to have
> > deleted leaf page with 0 entries.
> >
> >    if (GinPageIsDeleted(page))
> >    {
> >       if (!GinPageIsLeaf(page))
> >          ereport(ERROR,
> >                (errcode(ERRCODE_INDEX_CORRUPTED),
> >                 errmsg("index \"%s\" has deleted internal page %u",
> >                      RelationGetRelationName(rel), blockNo)));
> >       if (PageGetMaxOffsetNumber(page) > InvalidOffsetNumber)
> >          ereport(ERROR,
> >                (errcode(ERRCODE_INDEX_CORRUPTED),
> >                 errmsg("index \"%s\" has deleted page %u with tuples",
> >                      RelationGetRelationName(rel), blockNo)));
> >    }
>
> To enforce such an invariant we must be sure that GIN never deleted entry 
> pages in older versions. I do not have enough knowledge of the history for 
> this.

Agree, good point.

>
> > 11) When we compare entry tree max page key with parent key:
> >
> >             if (ginCompareAttEntries(&state, attnum, current_key,
> >                              current_key_category, parent_key_attnum,
> >                                      parent_key, parent_key_category) > 0)
> >             {
> >                /*
> >                 * There was a discrepancy between parent and child
> >                 * tuples. We need to verify it is not a result of
> >                 * concurrent call of gistplacetopage(). So, lock parent
> >                 * and try to find downlink for current page. It may be
> >                 * missing due to concurrent page split, this is OK.
> >                 */
> >                pfree(stack->parenttup);
> >                stack->parenttup = gin_refind_parent(rel, stack->parentblk,
> >                                            stack->blkno, strategy);
> >
> > I think we can remove gin_refind_parent() and do ereport right away here.
> > The same logic as with 3). AFAIK it's impossible to have a child item
> > with a key that is higher than the cached parent key.
> > Parent key bounds what keys we can insert into the child page, so it
> > seems there is no way how they can appear there.
> >
>
> This logic was copied from GiST check. In GiST "Area of responsibility" of 
> internal tuple can be extended in any direction. That's why we need to lock 
> parent page.
> If in GIN internal tuple keyspace is never extended - it's OK to avoid 
> gin_refind_parent().
> But reasoning about GIN concurrency is rather difficult. Unfortunately, we do 
> not have such checks in B-tree verification without ShareLock. Either way we 
> could peep some idea from there.
>

Got it.


Here is the new version. I fixed some points that Andrey mentioned.
All of them in the TAP test. Several comments were added, filler size
1870 changed to 1900. Also I added vowels to the replace function and
moved index creation after the data filling. Thank you!


Best regards,
Arseniy Mukhin
From 64364caee85683f73818628decd066b33f01a7dd Mon Sep 17 00:00:00 2001
From: Tomas Vondra <to...@vondra.me>
Date: Mon, 9 Jun 2025 01:44:59 +0200
Subject: [PATCH v7 2/5] patch 1: gin_check_parent_keys_consistency

This commit address several issues:
- Current code doesn't take into account attribute number when comparing index entries, but for multicolumn indexes attnum defines order.
- Add check that root page entries are ordered (there is nothing special about the root page that would prevent us from checking the order)
- Add checking order for the rightmost entry of the leaf level and skip it only for inner level.
- Split detection fix: in the case of a split, the cached parent key is greater than the current child key, not less.
---
 contrib/amcheck/meson.build         |   1 +
 contrib/amcheck/t/006_verify_gin.pl | 199 ++++++++++++++++++++++++++++
 contrib/amcheck/verify_gin.c        |  37 +++---
 3 files changed, 219 insertions(+), 18 deletions(-)
 create mode 100644 contrib/amcheck/t/006_verify_gin.pl

diff --git a/contrib/amcheck/meson.build b/contrib/amcheck/meson.build
index b33e8c9b062..1f0c347ed54 100644
--- a/contrib/amcheck/meson.build
+++ b/contrib/amcheck/meson.build
@@ -49,6 +49,7 @@ tests += {
       't/003_cic_2pc.pl',
       't/004_verify_nbtree_unique.pl',
       't/005_pitr.pl',
+      't/006_verify_gin.pl',
     ],
   },
 }
diff --git a/contrib/amcheck/t/006_verify_gin.pl b/contrib/amcheck/t/006_verify_gin.pl
new file mode 100644
index 00000000000..e73c9d9f92e
--- /dev/null
+++ b/contrib/amcheck/t/006_verify_gin.pl
@@ -0,0 +1,199 @@
+
+# Copyright (c) 2021-2025, PostgreSQL Global Development Group
+
+use strict;
+use warnings FATAL => 'all';
+
+use PostgreSQL::Test::Cluster;
+use PostgreSQL::Test::Utils;
+
+use Test::More;
+
+my $node;
+my $blksize;
+
+# to get the split fast, we want tuples to be as large as possible, but the same time we don't want them to be toasted.
+my $filler_size = 1900;
+
+#
+# Test set-up
+#
+$node = PostgreSQL::Test::Cluster->new('test');
+$node->init(no_data_checksums => 1);
+$node->append_conf('postgresql.conf', 'autovacuum=off');
+$node->start;
+$blksize = int($node->safe_psql('postgres', 'SHOW block_size;'));
+$node->safe_psql('postgres', q(CREATE EXTENSION amcheck));
+$node->safe_psql(
+	'postgres', q(
+		CREATE OR REPLACE FUNCTION  random_string( INT ) RETURNS text AS $$
+		SELECT string_agg(substring('0123456789abcdefghijklmnopqrstuvwxyz', ceil(random() * 36)::integer, 1), '') from generate_series(1, $1);
+		$$ LANGUAGE SQL;));
+
+# Tests
+invalid_entry_order_leaf_page_test();
+invalid_entry_order_inner_page_test();
+invalid_entry_columns_order_test();
+
+sub invalid_entry_order_leaf_page_test
+{
+	my $relname = "test";
+	my $indexname = "test_gin_idx";
+
+	$node->safe_psql(
+		'postgres', qq(
+		DROP TABLE IF EXISTS $relname;
+		CREATE TABLE $relname (a text[]);
+		INSERT INTO $relname (a) VALUES ('{aaaaa,bbbbb}');
+		CREATE INDEX $indexname ON $relname USING gin (a);
+	 ));
+	my $relpath = relation_filepath($indexname);
+
+	$node->stop;
+
+	my $blkno = 1;  # root
+
+	# produce wrong order by replacing aaaaa with ccccc
+	string_replace_block(
+		$relpath,
+		'aaaaa',
+		'ccccc',
+		$blkno
+	);
+
+	$node->start;
+
+	my ($result, $stdout, $stderr) = $node->psql('postgres', qq(SELECT gin_index_check('$indexname')));
+	my $expected = "index \"$indexname\" has wrong tuple order on entry tree page, block 1, offset 2, rightlink 4294967295";
+	like($stderr, qr/$expected/);
+}
+
+sub invalid_entry_order_inner_page_test
+{
+	my $relname = "test";
+	my $indexname = "test_gin_idx";
+
+	# to break the order in the inner page we need at least 3 items (rightmost key in the inner level is not checked for the order)
+	# so fill table until we have 2 splits
+	$node->safe_psql(
+		'postgres', qq(
+		DROP TABLE IF EXISTS $relname;
+		CREATE TABLE $relname (a text[]);
+		INSERT INTO $relname (a) VALUES (('{' || 'pppppppppp' || random_string($filler_size) ||'}')::text[]);
+		INSERT INTO $relname (a) VALUES (('{' || 'qqqqqqqqqq' || random_string($filler_size) ||'}')::text[]);
+		INSERT INTO $relname (a) VALUES (('{' || 'rrrrrrrrrr' || random_string($filler_size) ||'}')::text[]);
+		INSERT INTO $relname (a) VALUES (('{' || 'ssssssssss' || random_string($filler_size) ||'}')::text[]);
+		INSERT INTO $relname (a) VALUES (('{' || 'tttttttttt' || random_string($filler_size) ||'}')::text[]);
+		INSERT INTO $relname (a) VALUES (('{' || 'uuuuuuuuuu' || random_string($filler_size) ||'}')::text[]);
+		INSERT INTO $relname (a) VALUES (('{' || 'vvvvvvvvvv' || random_string($filler_size) ||'}')::text[]);
+		INSERT INTO $relname (a) VALUES (('{' || 'wwwwwwwwww' || random_string($filler_size) ||'}')::text[]);
+		CREATE INDEX $indexname ON $relname USING gin (a);
+	));
+	my $relpath = relation_filepath($indexname);
+
+	$node->stop;
+
+	my $blkno = 1;  # root
+
+	# we have rrrrrrrrr... and tttttttttt... as keys in the root, so produce wrong order by replacing rrrrrrrrrr....
+	string_replace_block(
+		$relpath,
+		'rrrrrrrrrr',
+		'zzzzzzzzzz',
+		$blkno
+	);
+
+	$node->start;
+
+	my ($result, $stdout, $stderr) = $node->psql('postgres', qq(SELECT gin_index_check('$indexname')));
+	my $expected = "index \"$indexname\" has wrong tuple order on entry tree page, block 1, offset 2, rightlink 4294967295";
+	like($stderr, qr/$expected/);
+}
+
+sub invalid_entry_columns_order_test
+{
+	my $relname = "test";
+	my $indexname = "test_gin_idx";
+
+	$node->safe_psql(
+		'postgres', qq(
+		DROP TABLE IF EXISTS $relname;
+		CREATE TABLE $relname (a text[],b text[]);
+		INSERT INTO $relname (a,b) VALUES ('{aaa}','{bbb}');
+		CREATE INDEX $indexname ON $relname USING gin (a,b);
+	));
+	my $relpath = relation_filepath($indexname);
+
+	$node->stop;
+
+	my $blkno = 1;  # root
+
+	# mess column numbers
+	# root items order before: (1,aaa), (2,bbb)
+	# root items order after:  (2,aaa), (1,bbb)
+	my $attrno_1 = pack('s', 1);
+	my $attrno_2 = pack('s', 2);
+
+	my $find = qr/($attrno_1)(.)(aaa)/s;
+	my $replace = $attrno_2 . '$2$3';
+	string_replace_block(
+		$relpath,
+		$find,
+		$replace,
+		$blkno
+	);
+
+	$find = qr/($attrno_2)(.)(bbb)/s;
+	$replace = $attrno_1 . '$2$3';
+	string_replace_block(
+		$relpath,
+		$find,
+		$replace,
+		$blkno
+	);
+
+	$node->start;
+
+	my ($result, $stdout, $stderr) = $node->psql('postgres', qq(SELECT gin_index_check('$indexname')));
+	my $expected = "index \"$indexname\" has wrong tuple order on entry tree page, block 1, offset 2, rightlink 4294967295";
+	like($stderr, qr/$expected/);
+}
+
+# Returns the filesystem path for the named relation.
+sub relation_filepath
+{
+	my ($relname) = @_;
+
+	my $pgdata = $node->data_dir;
+	my $rel = $node->safe_psql('postgres',
+		qq(SELECT pg_relation_filepath('$relname')));
+	die "path not found for relation $relname" unless defined $rel;
+	return "$pgdata/$rel";
+}
+
+# substitute pattern 'find' with 'replace' within the block with number 'blkno' in the file 'filename'
+sub string_replace_block
+{
+	my ($filename, $find, $replace, $blkno) = @_;
+
+	my $fh;
+	open($fh, '+<', $filename) or BAIL_OUT("open failed: $!");
+	binmode $fh;
+
+	my $offset = $blkno * $blksize;
+	my $buffer;
+
+	sysseek($fh, $offset, 0) or BAIL_OUT("seek failed: $!");
+	sysread($fh, $buffer, $blksize) or BAIL_OUT("read failed: $!");
+
+	$buffer =~ s/$find/'"' . $replace . '"'/gee;
+
+	sysseek($fh, $offset, 0) or BAIL_OUT("seek failed: $!");
+	syswrite($fh, $buffer) or BAIL_OUT("write failed: $!");
+
+	close($fh) or BAIL_OUT("close failed: $!");
+
+	return;
+}
+
+done_testing();
diff --git a/contrib/amcheck/verify_gin.c b/contrib/amcheck/verify_gin.c
index b5f363562e3..26b98571b56 100644
--- a/contrib/amcheck/verify_gin.c
+++ b/contrib/amcheck/verify_gin.c
@@ -463,17 +463,18 @@ gin_check_parent_keys_consistency(Relation rel,
 			Datum		parent_key = gintuple_get_key(&state,
 													  stack->parenttup,
 													  &parent_key_category);
+			OffsetNumber parent_key_attnum = gintuple_get_attrnum(&state, stack->parenttup);
 			ItemId		iid = PageGetItemIdCareful(rel, stack->blkno,
 												   page, maxoff);
 			IndexTuple	idxtuple = (IndexTuple) PageGetItem(page, iid);
-			OffsetNumber attnum = gintuple_get_attrnum(&state, idxtuple);
+			OffsetNumber page_max_key_attnum = gintuple_get_attrnum(&state, idxtuple);
 			GinNullCategory page_max_key_category;
 			Datum		page_max_key = gintuple_get_key(&state, idxtuple, &page_max_key_category);
 
 			if (rightlink != InvalidBlockNumber &&
-				ginCompareEntries(&state, attnum, page_max_key,
-								  page_max_key_category, parent_key,
-								  parent_key_category) > 0)
+				ginCompareAttEntries(&state, page_max_key_attnum, page_max_key,
+									 page_max_key_category, parent_key_attnum,
+									 parent_key, parent_key_category) < 0)
 			{
 				/* split page detected, install right link to the stack */
 				GinScanItem *ptr;
@@ -528,20 +529,18 @@ gin_check_parent_keys_consistency(Relation rel,
 			current_key = gintuple_get_key(&state, idxtuple, &current_key_category);
 
 			/*
-			 * First block is metadata, skip order check. Also, never check
-			 * for high key on rightmost page, as this key is not really
-			 * stored explicitly.
+			 * Never check for high key on rightmost inner page, as this key
+			 * is not really stored explicitly.
 			 *
 			 * Also make sure to not compare entries for different attnums,
 			 * which may be stored on the same page.
 			 */
-			if (i != FirstOffsetNumber && attnum == prev_attnum && stack->blkno != GIN_ROOT_BLKNO &&
-				!(i == maxoff && rightlink == InvalidBlockNumber))
+			if (i != FirstOffsetNumber && !(i == maxoff && rightlink == InvalidBlockNumber && !GinPageIsLeaf(page)))
 			{
 				prev_key = gintuple_get_key(&state, prev_tuple, &prev_key_category);
-				if (ginCompareEntries(&state, attnum, prev_key,
-									  prev_key_category, current_key,
-									  current_key_category) >= 0)
+				if (ginCompareAttEntries(&state, prev_attnum, prev_key,
+										 prev_key_category, attnum,
+										 current_key, current_key_category) >= 0)
 					ereport(ERROR,
 							(errcode(ERRCODE_INDEX_CORRUPTED),
 							 errmsg("index \"%s\" has wrong tuple order on entry tree page, block %u, offset %u, rightlink %u",
@@ -556,13 +555,14 @@ gin_check_parent_keys_consistency(Relation rel,
 				i == maxoff)
 			{
 				GinNullCategory parent_key_category;
+				OffsetNumber parent_key_attnum = gintuple_get_attrnum(&state, stack->parenttup);
 				Datum		parent_key = gintuple_get_key(&state,
 														  stack->parenttup,
 														  &parent_key_category);
 
-				if (ginCompareEntries(&state, attnum, current_key,
-									  current_key_category, parent_key,
-									  parent_key_category) > 0)
+				if (ginCompareAttEntries(&state, attnum, current_key,
+										 current_key_category, parent_key_attnum,
+										 parent_key, parent_key_category) > 0)
 				{
 					/*
 					 * There was a discrepancy between parent and child
@@ -581,6 +581,7 @@ gin_check_parent_keys_consistency(Relation rel,
 							 stack->blkno, stack->parentblk);
 					else
 					{
+						parent_key_attnum = gintuple_get_attrnum(&state, stack->parenttup);
 						parent_key = gintuple_get_key(&state,
 													  stack->parenttup,
 													  &parent_key_category);
@@ -589,9 +590,9 @@ gin_check_parent_keys_consistency(Relation rel,
 						 * Check if it is properly adjusted. If succeed,
 						 * proceed to the next key.
 						 */
-						if (ginCompareEntries(&state, attnum, current_key,
-											  current_key_category, parent_key,
-											  parent_key_category) > 0)
+						if (ginCompareAttEntries(&state, attnum, current_key,
+												 current_key_category, parent_key_attnum,
+												 parent_key, parent_key_category) > 0)
 							ereport(ERROR,
 									(errcode(ERRCODE_INDEX_CORRUPTED),
 									 errmsg("index \"%s\" has inconsistent records on page %u offset %u",
-- 
2.43.0

From 79d67a965baca64e45cd9d08ef37576bfee43ff7 Mon Sep 17 00:00:00 2001
From: Arseniy Mukhin <arseniy.mukhin....@gmail.com>
Date: Mon, 9 Jun 2025 20:39:13 +0300
Subject: [PATCH v7 3/5] patch 2: gin_check_parent_keys_consistency

This commit address issues with parent_key checks for the entry tree in gin_index_check():
- parenttup is always NULL. We want to set parenttup to NULL for the rightmost tuple of the level only, in all other cases it should be valid parent_key.
- use GinGetDownlink while retrieving child blkno to avoid triggering Assert (as core GIN code does).
---
 contrib/amcheck/t/006_verify_gin.pl | 78 +++++++++++++++++++++++++++++
 contrib/amcheck/verify_gin.c        |  8 +--
 2 files changed, 82 insertions(+), 4 deletions(-)

diff --git a/contrib/amcheck/t/006_verify_gin.pl b/contrib/amcheck/t/006_verify_gin.pl
index e73c9d9f92e..b997fc85f86 100644
--- a/contrib/amcheck/t/006_verify_gin.pl
+++ b/contrib/amcheck/t/006_verify_gin.pl
@@ -34,6 +34,8 @@ $node->safe_psql(
 invalid_entry_order_leaf_page_test();
 invalid_entry_order_inner_page_test();
 invalid_entry_columns_order_test();
+inconsistent_with_parent_key__parent_key_corrupted_test();
+inconsistent_with_parent_key__child_key_corrupted_test();
 
 sub invalid_entry_order_leaf_page_test
 {
@@ -159,6 +161,82 @@ sub invalid_entry_columns_order_test
 	like($stderr, qr/$expected/);
 }
 
+sub inconsistent_with_parent_key__parent_key_corrupted_test
+{
+	my $relname = "test";
+	my $indexname = "test_gin_idx";
+
+	# fill the table until we have a split
+	$node->safe_psql(
+		'postgres', qq(
+		DROP TABLE IF EXISTS $relname;
+		CREATE TABLE $relname (a text[]);
+		INSERT INTO $relname (a) VALUES (('{' || 'llllllllll' || random_string($filler_size) ||'}')::text[]);
+		INSERT INTO $relname (a) VALUES (('{' || 'mmmmmmmmmm' || random_string($filler_size) ||'}')::text[]);
+		INSERT INTO $relname (a) VALUES (('{' || 'nnnnnnnnnn' || random_string($filler_size) ||'}')::text[]);
+		INSERT INTO $relname (a) VALUES (('{' || 'xxxxxxxxxx' || random_string($filler_size) ||'}')::text[]);
+		INSERT INTO $relname (a) VALUES (('{' || 'yyyyyyyyyy' || random_string($filler_size) ||'}')::text[]);
+		CREATE INDEX $indexname ON $relname USING gin (a);
+	));
+	my $relpath = relation_filepath($indexname);
+
+	$node->stop;
+
+	my $blkno = 1;  # root
+
+	# we have nnnnnnnnnn... as parent key in the root, so replace it with something smaller then child's keys
+	string_replace_block(
+		$relpath,
+		'nnnnnnnnnn',
+		'aaaaaaaaaa',
+		$blkno
+	);
+
+	$node->start;
+
+	my ($result, $stdout, $stderr) = $node->psql('postgres', qq(SELECT gin_index_check('$indexname')));
+	my $expected = "index \"$indexname\" has inconsistent records on page 3 offset 3";
+	like($stderr, qr/$expected/);
+}
+
+sub inconsistent_with_parent_key__child_key_corrupted_test
+{
+	my $relname = "test";
+	my $indexname = "test_gin_idx";
+
+	# fill the table until we have a split
+	$node->safe_psql(
+		'postgres', qq(
+		DROP TABLE IF EXISTS $relname;
+		CREATE TABLE $relname (a text[]);
+		INSERT INTO $relname (a) VALUES (('{' || 'llllllllll' || random_string($filler_size) ||'}')::text[]);
+		INSERT INTO $relname (a) VALUES (('{' || 'mmmmmmmmmm' || random_string($filler_size) ||'}')::text[]);
+		INSERT INTO $relname (a) VALUES (('{' || 'nnnnnnnnnn' || random_string($filler_size) ||'}')::text[]);
+		INSERT INTO $relname (a) VALUES (('{' || 'xxxxxxxxxx' || random_string($filler_size) ||'}')::text[]);
+		INSERT INTO $relname (a) VALUES (('{' || 'yyyyyyyyyy' || random_string($filler_size) ||'}')::text[]);
+		CREATE INDEX $indexname ON $relname USING gin (a);
+	 ));
+	my $relpath = relation_filepath($indexname);
+
+	$node->stop;
+
+	my $blkno = 3;  # leaf
+
+	# we have nnnnnnnnnn... as parent key in the root, so replace child key with something bigger
+	string_replace_block(
+		$relpath,
+		'nnnnnnnnnn',
+		'pppppppppp',
+		$blkno
+	);
+
+	$node->start;
+
+	my ($result, $stdout, $stderr) = $node->psql('postgres', qq(SELECT gin_index_check('$indexname')));
+	my $expected = "index \"$indexname\" has inconsistent records on page 3 offset 3";
+	like($stderr, qr/$expected/);
+}
+
 # Returns the filesystem path for the named relation.
 sub relation_filepath
 {
diff --git a/contrib/amcheck/verify_gin.c b/contrib/amcheck/verify_gin.c
index 26b98571b56..8f6a5410cb7 100644
--- a/contrib/amcheck/verify_gin.c
+++ b/contrib/amcheck/verify_gin.c
@@ -609,10 +609,10 @@ gin_check_parent_keys_consistency(Relation rel,
 				ptr = (GinScanItem *) palloc(sizeof(GinScanItem));
 				ptr->depth = stack->depth + 1;
 				/* last tuple in layer has no high key */
-				if (i != maxoff && !GinPageGetOpaque(page)->rightlink)
-					ptr->parenttup = CopyIndexTuple(idxtuple);
-				else
+				if (i == maxoff && rightlink == InvalidBlockNumber)
 					ptr->parenttup = NULL;
+				else
+					ptr->parenttup = CopyIndexTuple(idxtuple);
 				ptr->parentblk = stack->blkno;
 				ptr->blkno = GinGetDownlink(idxtuple);
 				ptr->parentlsn = lsn;
@@ -750,7 +750,7 @@ gin_refind_parent(Relation rel, BlockNumber parentblkno,
 		ItemId		p_iid = PageGetItemIdCareful(rel, parentblkno, parentpage, o);
 		IndexTuple	itup = (IndexTuple) PageGetItem(parentpage, p_iid);
 
-		if (ItemPointerGetBlockNumber(&(itup->t_tid)) == childblkno)
+		if (GinGetDownlink(itup) == childblkno)
 		{
 			/* Found it! Make copy and return it */
 			result = CopyIndexTuple(itup);
-- 
2.43.0

From b7b487b2d1dd97fbf6062674515711b9c2e4409e Mon Sep 17 00:00:00 2001
From: Arseniy Mukhin <arseniy.mukhin....@gmail.com>
Date: Mon, 9 Jun 2025 20:49:11 +0300
Subject: [PATCH v7 4/5] patch 3:
 gin_check_posting_tree_parent_keys_consistency

This commit address issues with parent_key checks for the posting tree in gin_index_check():
- Check if parentkey is valid before using it in the comparison. Checking that stack->parentblk is valid blocknum is not enough
- Set invalid parentkey for the rightmost child only, not for all children of the rightmost page.
---
 contrib/amcheck/t/006_verify_gin.pl | 39 +++++++++++++++++++++++++++++
 contrib/amcheck/verify_gin.c        | 12 +++------
 2 files changed, 43 insertions(+), 8 deletions(-)

diff --git a/contrib/amcheck/t/006_verify_gin.pl b/contrib/amcheck/t/006_verify_gin.pl
index b997fc85f86..ff857aa9886 100644
--- a/contrib/amcheck/t/006_verify_gin.pl
+++ b/contrib/amcheck/t/006_verify_gin.pl
@@ -36,6 +36,7 @@ invalid_entry_order_inner_page_test();
 invalid_entry_columns_order_test();
 inconsistent_with_parent_key__parent_key_corrupted_test();
 inconsistent_with_parent_key__child_key_corrupted_test();
+inconsistent_with_parent_key__parent_key_corrupted_posting_tree_test();
 
 sub invalid_entry_order_leaf_page_test
 {
@@ -237,6 +238,44 @@ sub inconsistent_with_parent_key__child_key_corrupted_test
 	like($stderr, qr/$expected/);
 }
 
+sub inconsistent_with_parent_key__parent_key_corrupted_posting_tree_test
+{
+	my $relname = "test";
+	my $indexname = "test_gin_idx";
+
+	$node->safe_psql(
+		'postgres', qq(
+		DROP TABLE IF EXISTS $relname;
+		CREATE TABLE $relname (a text[]);
+		INSERT INTO $relname (a) select ('{aaaaa}') from generate_series(1,10000);
+		CREATE INDEX $indexname ON $relname USING gin (a);
+	));
+	my $relpath = relation_filepath($indexname);
+
+	$node->stop;
+
+	my $blkno = 2;  # posting tree root
+
+	# we have a posting tree for 'aaaaa' key with the root at 2nd block
+	# and two leaf pages 3 and 4. replace 4th page's high key with (1,1)
+	# so that there are tid's in leaf page that are larger then the new high key.
+	my $find = pack('S*', 0, 4, 0) . '....';
+	my $replace = pack('S*', 0, 4, 0, 1, 1);
+	string_replace_block(
+		$relpath,
+		$find,
+		$replace,
+		$blkno
+	);
+
+	$node->start;
+
+	my ($result, $stdout, $stderr) = $node->psql('postgres', qq(SELECT gin_index_check('$indexname')));
+	my $expected = "index \"$indexname\": tid exceeds parent's high key in postingTree leaf on block 4";
+	like($stderr, qr/$expected/);
+}
+
+
 # Returns the filesystem path for the named relation.
 sub relation_filepath
 {
diff --git a/contrib/amcheck/verify_gin.c b/contrib/amcheck/verify_gin.c
index 8f6a5410cb7..427cf1669a6 100644
--- a/contrib/amcheck/verify_gin.c
+++ b/contrib/amcheck/verify_gin.c
@@ -346,7 +346,7 @@ gin_check_posting_tree_parent_keys_consistency(Relation rel, BlockNumber posting
 				 * Check if this tuple is consistent with the downlink in the
 				 * parent.
 				 */
-				if (stack->parentblk != InvalidBlockNumber && i == maxoff &&
+				if (i == maxoff && ItemPointerIsValid(&stack->parentkey) &&
 					ItemPointerCompare(&stack->parentkey, &posting_item->key) < 0)
 					ereport(ERROR,
 							(errcode(ERRCODE_INDEX_CORRUPTED),
@@ -359,14 +359,10 @@ gin_check_posting_tree_parent_keys_consistency(Relation rel, BlockNumber posting
 				ptr->depth = stack->depth + 1;
 
 				/*
-				 * Set rightmost parent key to invalid item pointer. Its value
-				 * is 'Infinity' and not explicitly stored.
+				 * The rightmost parent key is always invalid item pointer.
+				 * Its value is 'Infinity' and not explicitly stored.
 				 */
-				if (rightlink == InvalidBlockNumber)
-					ItemPointerSetInvalid(&ptr->parentkey);
-				else
-					ptr->parentkey = posting_item->key;
-
+				ptr->parentkey = posting_item->key;
 				ptr->parentblk = stack->blkno;
 				ptr->blkno = BlockIdGetBlockNumber(&posting_item->child_blkno);
 				ptr->next = stack->next;
-- 
2.43.0

From 4344b1b27a554139db20c3a1ba6f9250a577a45d Mon Sep 17 00:00:00 2001
From: Arseniy Mukhin <arseniy.mukhin....@gmail.com>
Date: Mon, 9 Jun 2025 20:51:05 +0300
Subject: [PATCH v7 5/5] patch 4: remove unused parentlsn

This commit removes unused field parentlsn in gin_index_check()
---
 contrib/amcheck/verify_gin.c | 6 ------
 1 file changed, 6 deletions(-)

diff --git a/contrib/amcheck/verify_gin.c b/contrib/amcheck/verify_gin.c
index 427cf1669a6..f49e27b1a30 100644
--- a/contrib/amcheck/verify_gin.c
+++ b/contrib/amcheck/verify_gin.c
@@ -38,7 +38,6 @@ typedef struct GinScanItem
 	int			depth;
 	IndexTuple	parenttup;
 	BlockNumber parentblk;
-	XLogRecPtr	parentlsn;
 	BlockNumber blkno;
 	struct GinScanItem *next;
 } GinScanItem;
@@ -417,7 +416,6 @@ gin_check_parent_keys_consistency(Relation rel,
 	stack->depth = 0;
 	stack->parenttup = NULL;
 	stack->parentblk = InvalidBlockNumber;
-	stack->parentlsn = InvalidXLogRecPtr;
 	stack->blkno = GIN_ROOT_BLKNO;
 
 	while (stack)
@@ -428,7 +426,6 @@ gin_check_parent_keys_consistency(Relation rel,
 		OffsetNumber i,
 					maxoff,
 					prev_attnum;
-		XLogRecPtr	lsn;
 		IndexTuple	prev_tuple;
 		BlockNumber rightlink;
 
@@ -438,7 +435,6 @@ gin_check_parent_keys_consistency(Relation rel,
 									RBM_NORMAL, strategy);
 		LockBuffer(buffer, GIN_SHARE);
 		page = (Page) BufferGetPage(buffer);
-		lsn = BufferGetLSNAtomic(buffer);
 		maxoff = PageGetMaxOffsetNumber(page);
 		rightlink = GinPageGetOpaque(page)->rightlink;
 
@@ -481,7 +477,6 @@ gin_check_parent_keys_consistency(Relation rel,
 				ptr->depth = stack->depth;
 				ptr->parenttup = CopyIndexTuple(stack->parenttup);
 				ptr->parentblk = stack->parentblk;
-				ptr->parentlsn = stack->parentlsn;
 				ptr->blkno = rightlink;
 				ptr->next = stack->next;
 				stack->next = ptr;
@@ -611,7 +606,6 @@ gin_check_parent_keys_consistency(Relation rel,
 					ptr->parenttup = CopyIndexTuple(idxtuple);
 				ptr->parentblk = stack->blkno;
 				ptr->blkno = GinGetDownlink(idxtuple);
-				ptr->parentlsn = lsn;
 				ptr->next = stack->next;
 				stack->next = ptr;
 			}
-- 
2.43.0

From 426895c13c77eab3e988a49da19434c0888323e0 Mon Sep 17 00:00:00 2001
From: Tomas Vondra <to...@vondra.me>
Date: Mon, 9 Jun 2025 01:42:52 +0200
Subject: [PATCH v7 1/5] amcheck: Add gin_index_check on a multicolumn index

This commit adds a regression test to verify that gin_index_check()
correctly handles multicolumn indexes, increases its test coverage.

Author: Arseniy Mukhin <arseniy.mukhin....@gmail.com>
Discussion: https://postgr.es/m/CAE7r3MJ611B9TE=yqbbncewp7-k64vws+sjk7xf6fjux77u...@mail.gmail.com
---
 contrib/amcheck/expected/check_gin.out | 12 ++++++++++++
 contrib/amcheck/sql/check_gin.sql      | 10 ++++++++++
 2 files changed, 22 insertions(+)

diff --git a/contrib/amcheck/expected/check_gin.out b/contrib/amcheck/expected/check_gin.out
index b4f0b110747..8dd01ced8d1 100644
--- a/contrib/amcheck/expected/check_gin.out
+++ b/contrib/amcheck/expected/check_gin.out
@@ -76,3 +76,15 @@ SELECT gin_index_check('gin_check_jsonb_idx');
 
 -- cleanup
 DROP TABLE gin_check_jsonb;
+-- Test GIN multicolumn index
+CREATE TABLE "gin_check_multicolumn"(a text[], b text[]);
+INSERT INTO gin_check_multicolumn (a,b) values ('{a,c,e}','{b,d,f}');
+CREATE INDEX "gin_check_multicolumn_idx" on gin_check_multicolumn USING GIN(a,b);
+SELECT gin_index_check('gin_check_multicolumn_idx');
+ gin_index_check 
+-----------------
+ 
+(1 row)
+
+-- cleanup
+DROP TABLE gin_check_multicolumn;
diff --git a/contrib/amcheck/sql/check_gin.sql b/contrib/amcheck/sql/check_gin.sql
index 66f42c34311..11caed3d6a8 100644
--- a/contrib/amcheck/sql/check_gin.sql
+++ b/contrib/amcheck/sql/check_gin.sql
@@ -50,3 +50,13 @@ SELECT gin_index_check('gin_check_jsonb_idx');
 
 -- cleanup
 DROP TABLE gin_check_jsonb;
+
+-- Test GIN multicolumn index
+CREATE TABLE "gin_check_multicolumn"(a text[], b text[]);
+INSERT INTO gin_check_multicolumn (a,b) values ('{a,c,e}','{b,d,f}');
+CREATE INDEX "gin_check_multicolumn_idx" on gin_check_multicolumn USING GIN(a,b);
+
+SELECT gin_index_check('gin_check_multicolumn_idx');
+
+-- cleanup
+DROP TABLE gin_check_multicolumn;
-- 
2.43.0

Reply via email to