05.04.2019 18:01, Tom Lane writes:
Andrew Dunstan <andrew.duns...@2ndquadrant.com> writes:
On Fri, Apr 5, 2019 at 2:02 AM Thomas Munro <thomas.mu...@gmail.com> wrote:
This is a strange failure:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=loach&dt=2019-04-05%2005%3A15%3A00
[ wrong answers from queries using a GIST index ]
There are a couple of other recent instances of this failure, on
francolin and whelk.
Yeah.  Given three failures in a couple of days, we can reasonably
guess that the problem was introduced within a day or two prior to
the first one.  Looking at what's touched GIST in that time frame,
suspicion has to fall heavily on 9155580fd5fc2a0cbb23376dfca7cd21f59c2c7b.

If I had to bet, I'd bet that there's something wrong with the
machinations described in the commit message:
For GiST, the LSN-NSN interlock makes this a little tricky. All pages must
     be marked with a valid (i.e. non-zero) LSN, so that the parent-child
     LSN-NSN interlock works correctly. We now use magic value 1 for that during
     index build. Change the fake LSN counter to begin from 1000, so that 1 is
     safely smaller than any real or fake LSN. 2 would've been enough for our
     purposes, but let's reserve a bigger range, in case we need more special
     values in the future.

I'll go add this as an open issue.

                        regards, tom lane


Hi,
I've already noticed the same failure in our company buildfarm and started the research.

You are right, it's the " Generate less WAL during GiST, GIN and SP-GiST index build. " patch to blame. Because of using the GistBuildLSN some pages are not linked correctly, so index scan cannot find some entries, while seqscan finds them.

In attachment, you can find patch with a test that allows to reproduce the bug not randomly, but on every run.
Now I'm trying to find a way to fix the issue.

--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

commit b233d40c1bd1b270ec0ea1d52887794320a8f595
Author: Anastasia <a.lubennik...@postgrespro.ru>
Date:   Fri Apr 5 19:25:48 2019 +0300

    Test to reproduce gist index problem caused by patch Generate less WAL during GiST, GIN and SP-GiST index build

diff --git a/contrib/intarray/Makefile b/contrib/intarray/Makefile
index 2505294..3535bb1 100644
--- a/contrib/intarray/Makefile
+++ b/contrib/intarray/Makefile
@@ -9,7 +9,7 @@ DATA = intarray--1.2.sql intarray--1.1--1.2.sql intarray--1.0--1.1.sql \
 	intarray--unpackaged--1.0.sql
 PGFILEDESC = "intarray - functions and operators for arrays of integers"
 
-REGRESS = _int
+REGRESS = _int _int2
 
 ifdef USE_PGXS
 PG_CONFIG = pg_config
diff --git a/contrib/intarray/expected/_int2.out b/contrib/intarray/expected/_int2.out
new file mode 100644
index 0000000..e88fd34
--- /dev/null
+++ b/contrib/intarray/expected/_int2.out
@@ -0,0 +1,48 @@
+CREATE TABLE test__int2( a int[] );
+\copy test__int2 from 'data/test__int.data'
+ANALYZE test__int2;
+SELECT count(*) from test__int2 WHERE a && '{23,50}';
+ count 
+-------
+   403
+(1 row)
+
+SELECT count(*) from test__int2 WHERE a @@ '23|50';
+ count 
+-------
+   403
+(1 row)
+
+CREATE INDEX text_idx2 on test__int2 using gist ( a gist__int_ops );
+SELECT count(*) from test__int2 WHERE a && '{23,50}';
+ count 
+-------
+   403
+(1 row)
+
+SELECT count(*) from test__int2 WHERE a @@ '23|50';
+ count 
+-------
+   403
+(1 row)
+
+SELECT a, t FROM (SELECT a::text, (SELECT COUNT (*) FROM test__int2 t2 WHERE t2.a = t1.a) t from test__int2 t1 WHERE a IS NOT NULL) s WHERE t = 0;
+ a | t 
+---+---
+(0 rows)
+
+
+select * from test__int2 where a = '{193,213,230,266,285,299}';
+             a             
+---------------------------
+ {193,213,230,266,285,299}
+(1 row)
+
+set enable_indexscan to off;
+set enable_bitmapscan to off;
+select * from test__int2 where a = '{193,213,230,266,285,299}';
+             a             
+---------------------------
+ {193,213,230,266,285,299}
+(1 row)
+
diff --git a/contrib/intarray/sql/_int2.sql b/contrib/intarray/sql/_int2.sql
new file mode 100644
index 0000000..d4adca7
--- /dev/null
+++ b/contrib/intarray/sql/_int2.sql
@@ -0,0 +1,20 @@
+CREATE TABLE test__int2( a int[] );
+\copy test__int2 from 'data/test__int.data'
+ANALYZE test__int2;
+
+SELECT count(*) from test__int2 WHERE a && '{23,50}';
+SELECT count(*) from test__int2 WHERE a @@ '23|50';
+
+CREATE INDEX text_idx2 on test__int2 using gist ( a gist__int_ops );
+
+SELECT count(*) from test__int2 WHERE a && '{23,50}';
+SELECT count(*) from test__int2 WHERE a @@ '23|50';
+
+SELECT a, t FROM (SELECT a::text, (SELECT COUNT (*) FROM test__int2 t2 WHERE t2.a = t1.a) t from test__int2 t1 WHERE a IS NOT NULL) s WHERE t = 0;
+
+select * from test__int2 where a = '{193,213,230,266,285,299}';
+
+set enable_indexscan to off;
+set enable_bitmapscan to off;
+
+select * from test__int2 where a = '{193,213,230,266,285,299}';
diff --git a/src/backend/access/gist/gistutil.c b/src/backend/access/gist/gistutil.c
index 94b6ad6..2183b3a 100644
--- a/src/backend/access/gist/gistutil.c
+++ b/src/backend/access/gist/gistutil.c
@@ -509,6 +509,7 @@ gistchoose(Relation r, Page p, IndexTuple it,	/* it has compressed entry */
 			{
 				/* we didn't make the random choice yet for this old best */
 				keep_current_best = (random() <= (MAX_RANDOM_VALUE / 2)) ? 1 : 0;
+				keep_current_best = 0; /* FIXME only for debugging */
 			}
 			if (keep_current_best == 0)
 			{
@@ -531,6 +532,7 @@ gistchoose(Relation r, Page p, IndexTuple it,	/* it has compressed entry */
 			{
 				/* we didn't make the random choice yet for this old best */
 				keep_current_best = (random() <= (MAX_RANDOM_VALUE / 2)) ? 1 : 0;
+				keep_current_best = 0; /* FIXME only for debugging */
 			}
 			if (keep_current_best == 1)
 				break;

Reply via email to