Several BF animals have intermittently shown this regression diff:
diff -U3
/home/bf/bf-build/culicidae/HEAD/pgsql/contrib/postgres_fdw/expected/postgres_fdw.out
/home/bf/bf-build/culicidae/HEAD/pgsql.build/testrun/postgres_fdw-running/regress/results/postgres_fdw.out
---
/home/bf/bf-build/culicidae/HEAD/pgsql/contrib/postgres_fdw/expected/postgres_fdw.out
2025-12-29 19:48:22.661603936 +0100
+++
/home/bf/bf-build/culicidae/HEAD/pgsql.build/testrun/postgres_fdw-running/regress/results/postgres_fdw.out
2026-02-10 00:31:31.856460156 +0100
@@ -6519,6 +6519,7 @@
UPDATE ft2 SET c3 = 'bar' WHERE postgres_fdw_abs(c1) > 2000 RETURNING *;
c1 | c2 | c3 | c4 | c5 | c6 | c7 | c8
------+----+-----+----+----+----+------------+----
+ 2010 | 0 | bar | | | | ft2 |
2001 | 1 | bar | | | | ft2 |
2002 | 2 | bar | | | | ft2 |
2003 | 3 | bar | | | | ft2 |
@@ -6528,7 +6529,6 @@
2007 | 7 | bar | | | | ft2 |
2008 | 8 | bar | | | | ft2 |
2009 | 9 | bar | | | | ft2 |
- 2010 | 0 | bar | | | | ft2 |
(10 rows)
EXPLAIN (verbose, costs off)
The above is from culicidae [1], and greenfly has shown it a few times
[2], and here's one from scorpion [3], and crake on a back branch [4],
and there are a few more in the past 90 days.
It's pretty clear what is happening: the rows we are looking at are
being returned by a seqscan, and they were just inserted a few lines
above into a table that has been modified multiple times already.
So the test is reliant on them being inserted in sequence at the
end of the table, yet sometimes the last row is going into free
space someplace earlier.
One's first instinct is to blame autovacuum, but the test already
goes out of its way to disable that:
-- Disable autovacuum for these tables to avoid unexpected effects of that
ALTER TABLE "S 1"."T 1" SET (autovacuum_enabled = 'false');
After experimenting for awhile I think I have (part of) the answer.
All of the failing animals are using meson, which means that this
"installcheck" test case is probably running in parallel with other
test scripts in other databases in the same cluster. I've not
reproduced the exact symptom seen in the buildfarm, but I can easily
make the postgres_fdw test put these rows at different ctids if
I leave a transaction sitting open in a different database. So I
think we are seeing some effect of opportunistic page pruning behaving
differently depending on whether there is a concurrent transaction.
It's not clear to me that it's worth running this to ground in any
more detail than that. The behavior is not wrong; it's the test's
fault to assume that these rows will be returned in a deterministic
order. So I think the right fix is to adjust the test query,
along the lines of
-UPDATE ft2 SET c3 = 'bar' WHERE postgres_fdw_abs(c1) > 2000 RETURNING *;
+WITH cte AS (
+ UPDATE ft2 SET c3 = 'bar' WHERE postgres_fdw_abs(c1) > 2000 RETURNING *
+) SELECT * FROM cte ORDER BY c1;
Thoughts, better ideas?
regards, tom lane
[1]
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=culicidae&dt=2026-02-09%2023%3A21%3A25
[2]
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=greenfly&dt=2026-01-20%2004%3A32%3A29
[3]
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=scorpion&dt=2025-12-09%2015%3A23%3A53
[4]
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=crake&dt=2025-12-04%2017%3A50%3A38