05.04.2019 19:41, Anastasia Lubennikova writes:

05.04.2019 18:01, Tom Lane writes:
Andrew Dunstan <andrew.duns...@2ndquadrant.com> writes:
On Fri, Apr 5, 2019 at 2:02 AM Thomas Munro <thomas.mu...@gmail.com> wrote:
This is a strange failure:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=loach&dt=2019-04-05%2005%3A15%3A00
[ wrong answers from queries using a GIST index ]
There are a couple of other recent instances of this failure, on
francolin and whelk.
Yeah.  Given three failures in a couple of days, we can reasonably
guess that the problem was introduced within a day or two prior to
the first one.  Looking at what's touched GIST in that time frame,
suspicion has to fall heavily on 9155580fd5fc2a0cbb23376dfca7cd21f59c2c7b.

If I had to bet, I'd bet that there's something wrong with the
machinations described in the commit message:
          For GiST, the LSN-NSN interlock makes this a little tricky. All pages must      be marked with a valid (i.e. non-zero) LSN, so that the parent-child      LSN-NSN interlock works correctly. We now use magic value 1 for that during      index build. Change the fake LSN counter to begin from 1000, so that 1 is      safely smaller than any real or fake LSN. 2 would've been enough for our      purposes, but let's reserve a bigger range, in case we need more special
     values in the future.

I'll go add this as an open issue.

            regards, tom lane


Hi,
I've already noticed the same failure in our company buildfarm and started the research.

You are right, it's the " Generate less WAL during GiST, GIN and SP-GiST index build. " patch to blame. Because of using the GistBuildLSN some pages are not linked correctly, so index scan cannot find some entries, while seqscan finds them.

In attachment, you can find patch with a test that allows to reproduce the bug not randomly, but on every run.
Now I'm trying to find a way to fix the issue.

The problem was caused by incorrect detection of the page to insert new tuple after split. If gistinserttuple() of the tuple formed by gistgetadjusted() had to split the page, we must to go back to the parent and
descend back to the child that's a better fit for the new tuple.

Previously this was handled by the code block with the following comment:

* Concurrent split detected. There's no guarantee that the
* downlink for this page is consistent with the tuple we're
* inserting anymore, so go back to parent and rechoose the best
* child.

After introducing GistBuildNSN this code path became unreachable.
To fix it, I added new flag to detect such splits during indexbuild.

The patches with the test and fix are attached.

Many thanks to Teodor Sigaev, who helped to find the bug.

--

Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

commit 53938b423ec19a88e470b7b9552de13de5e2634a
Author: Anastasia <a.lubennik...@postgrespro.ru>
Date:   Tue Apr 9 18:58:00 2019 +0300

    fix for gist create index WAL optimization. Track splits correctly

diff --git a/src/backend/access/gist/gist.c b/src/backend/access/gist/gist.c
index 2db790c..ff98d13 100644
--- a/src/backend/access/gist/gist.c
+++ b/src/backend/access/gist/gist.c
@@ -639,6 +639,7 @@ gistdoinsert(Relation r, IndexTuple itup, Size freespace,
 	/* Start from the root */
 	firststack.blkno = GIST_ROOT_BLKNO;
 	firststack.lsn = 0;
+	firststack.retry_from_parent = false;
 	firststack.parent = NULL;
 	firststack.downlinkoffnum = InvalidOffsetNumber;
 	state.stack = stack = &firststack;
@@ -693,8 +694,15 @@ gistdoinsert(Relation r, IndexTuple itup, Size freespace,
 			continue;
 		}
 
+		/*
+		 * During gistbuild we cannot rely on lsn-nsn comparison,
+		 * since all pages have the same GistBuildLSN value in these fields.
+		 * So use special flag stack->retry_from_parent to detect the fact that
+		 * we need to go back to parent and rechoose the child.
+		 */
 		if (stack->blkno != GIST_ROOT_BLKNO &&
-			stack->parent->lsn < GistPageGetNSN(stack->page))
+			((stack->parent->lsn < GistPageGetNSN(stack->page)) ||
+			 stack->retry_from_parent == true))
 		{
 			/*
 			 * Concurrent split detected. There's no guarantee that the
@@ -786,6 +794,7 @@ gistdoinsert(Relation r, IndexTuple itup, Size freespace,
 						UnlockReleaseBuffer(stack->buffer);
 						xlocked = false;
 						state.stack = stack = stack->parent;
+						stack->retry_from_parent = true;
 					}
 					continue;
 				}
diff --git a/src/include/access/gist_private.h b/src/include/access/gist_private.h
index 78e2e3f..361a7a8 100644
--- a/src/include/access/gist_private.h
+++ b/src/include/access/gist_private.h
@@ -215,6 +215,12 @@ typedef struct GISTInsertStack
 	 */
 	GistNSN		lsn;
 
+	/*
+	 * flag to use during gistbuild to recognize page split
+	 * (instead of lsn-nsn comparison)
+	 */
+	bool retry_from_parent;
+
 	/* offset of the downlink in the parent page, that points to this page */
 	OffsetNumber downlinkoffnum;
 
commit 53f4607b50e0516be05ef27d9369c89bdcb9d82b
Author: Anastasia <a.lubennik...@postgrespro.ru>
Date:   Tue Apr 9 18:56:32 2019 +0300

    test

diff --git a/contrib/intarray/Makefile b/contrib/intarray/Makefile
index 2505294..3535bb1 100644
--- a/contrib/intarray/Makefile
+++ b/contrib/intarray/Makefile
@@ -9,7 +9,7 @@ DATA = intarray--1.2.sql intarray--1.1--1.2.sql intarray--1.0--1.1.sql \
 	intarray--unpackaged--1.0.sql
 PGFILEDESC = "intarray - functions and operators for arrays of integers"
 
-REGRESS = _int
+REGRESS = _int _int2
 
 ifdef USE_PGXS
 PG_CONFIG = pg_config
diff --git a/contrib/intarray/expected/_int2.out b/contrib/intarray/expected/_int2.out
new file mode 100644
index 0000000..8346334
--- /dev/null
+++ b/contrib/intarray/expected/_int2.out
@@ -0,0 +1,47 @@
+CREATE TABLE test__int2( a int[] );
+\copy test__int2 from 'data/test__int.data'
+ANALYZE test__int2;
+SELECT count(*) from test__int2 WHERE a && '{23,50}';
+ count 
+-------
+   403
+(1 row)
+
+SELECT count(*) from test__int2 WHERE a @@ '23|50';
+ count 
+-------
+   403
+(1 row)
+
+CREATE INDEX text_idx2 on test__int2 using gist ( a gist__int_ops );
+SELECT count(*) from test__int2 WHERE a && '{23,50}';
+ count 
+-------
+   403
+(1 row)
+
+SELECT count(*) from test__int2 WHERE a @@ '23|50';
+ count 
+-------
+   403
+(1 row)
+
+SELECT a, t FROM (SELECT a::text, (SELECT COUNT (*) FROM test__int2 t2 WHERE t2.a = t1.a) t from test__int2 t1 WHERE a IS NOT NULL) s WHERE t = 0;
+ a | t 
+---+---
+(0 rows)
+
+select * from test__int2 where a = '{193,213,230,266,285,299}';
+             a             
+---------------------------
+ {193,213,230,266,285,299}
+(1 row)
+
+set enable_indexscan to off;
+set enable_bitmapscan to off;
+select * from test__int2 where a = '{193,213,230,266,285,299}';
+             a             
+---------------------------
+ {193,213,230,266,285,299}
+(1 row)
+
diff --git a/contrib/intarray/sql/_int2.sql b/contrib/intarray/sql/_int2.sql
new file mode 100644
index 0000000..d4adca7
--- /dev/null
+++ b/contrib/intarray/sql/_int2.sql
@@ -0,0 +1,20 @@
+CREATE TABLE test__int2( a int[] );
+\copy test__int2 from 'data/test__int.data'
+ANALYZE test__int2;
+
+SELECT count(*) from test__int2 WHERE a && '{23,50}';
+SELECT count(*) from test__int2 WHERE a @@ '23|50';
+
+CREATE INDEX text_idx2 on test__int2 using gist ( a gist__int_ops );
+
+SELECT count(*) from test__int2 WHERE a && '{23,50}';
+SELECT count(*) from test__int2 WHERE a @@ '23|50';
+
+SELECT a, t FROM (SELECT a::text, (SELECT COUNT (*) FROM test__int2 t2 WHERE t2.a = t1.a) t from test__int2 t1 WHERE a IS NOT NULL) s WHERE t = 0;
+
+select * from test__int2 where a = '{193,213,230,266,285,299}';
+
+set enable_indexscan to off;
+set enable_bitmapscan to off;
+
+select * from test__int2 where a = '{193,213,230,266,285,299}';
diff --git a/src/backend/access/gist/gistutil.c b/src/backend/access/gist/gistutil.c
index 94b6ad6..2183b3a 100644
--- a/src/backend/access/gist/gistutil.c
+++ b/src/backend/access/gist/gistutil.c
@@ -509,6 +509,7 @@ gistchoose(Relation r, Page p, IndexTuple it,	/* it has compressed entry */
 			{
 				/* we didn't make the random choice yet for this old best */
 				keep_current_best = (random() <= (MAX_RANDOM_VALUE / 2)) ? 1 : 0;
+				keep_current_best = 0; /* FIXME only for debugging */
 			}
 			if (keep_current_best == 0)
 			{
@@ -531,6 +532,7 @@ gistchoose(Relation r, Page p, IndexTuple it,	/* it has compressed entry */
 			{
 				/* we didn't make the random choice yet for this old best */
 				keep_current_best = (random() <= (MAX_RANDOM_VALUE / 2)) ? 1 : 0;
+				keep_current_best = 0; /* FIXME only for debugging */
 			}
 			if (keep_current_best == 1)
 				break;

Reply via email to