On 2021/01/14 14:23, Tom Lane wrote:
Fujii Masao <[email protected]> writes:
On 2021/01/14 13:59, Michael Paquier wrote:
florican is reporting that this test has some stability problems:
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=florican&dt=2021-01-14%2003%3A55%3A45

My guess is that a checkpoint removed the requested WAL file, because the
test uses no replication slot and does not set wal_keep_size.
So an easy fix is to set wal_keep_size to 512MB or so in that test. Thoughts?

florican did pass this test on the v13 branch, so I agree it's probably
a timing issue not any deeper bug.  Your theory seems plausible.

Thanks for the check!

So, barring any objection, I will push the attached patch that sets
wal_keep_size in the test.
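
For reference, the two ways mentioned above to keep a checkpoint from removing
WAL that a standby still needs can be sketched as a configuration fragment.
This is only an illustration, not part of the patch: the slot name is
hypothetical, and the segment count assumes the default 16MB WAL segment size.

```conf
# Option 1 (what the attached patch does), set on the primary:
# keep at least this much past WAL in pg_wal for standbys.
# Assuming the default 16MB segment size, 512MB is 32 segments.
wal_keep_size = 512MB

# Option 2 (not used by this test), set on the standby:
# stream through a physical replication slot, so the primary retains
# WAL until the standby has received it. 'standby_slot' is a hypothetical
# name; the slot would have to be created on the primary first.
#primary_slot_name = 'standby_slot'
```

wal_keep_size is the simpler choice for a test, since it needs no slot to be
created or cleaned up.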

BTW, I included the URL to Michael's report [1] in the commit log. But this
URL doesn't seem to work, maybe because the <message-id> part includes
a slash character.

[1]
https://postgr.es/m/X//[email protected]

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION
From 2ed9eed25bfd7b6e22eade23e2c602af7123c731 Mon Sep 17 00:00:00 2001
From: Fujii Masao <[email protected]>
Date: Thu, 14 Jan 2021 14:37:01 +0900
Subject: [PATCH] Stabilize timeline switch regression test.

Commit fef5b47f6b added the regression test to check whether a standby is
able to follow a primary on a newer timeline when WAL archiving is enabled.
But the buildfarm member florican reported that this test failed because
the requested WAL segment was removed and replication failed. This is a
timing issue. Since the test neither uses a replication slot nor sets
wal_keep_size, a checkpoint could remove a WAL segment that is still
necessary for replication.

This commit stabilizes the test by setting wal_keep_size.

Back-patch to v13 where the regression test that this commit stabilizes
was added.

Author: Fujii Masao
Discussion: https://postgr.es/m/X//[email protected]
---
 src/test/recovery/t/004_timeline_switch.pl | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/src/test/recovery/t/004_timeline_switch.pl b/src/test/recovery/t/004_timeline_switch.pl
index edadab790f..91a63f4e58 100644
--- a/src/test/recovery/t/004_timeline_switch.pl
+++ b/src/test/recovery/t/004_timeline_switch.pl
@@ -75,6 +75,10 @@ is($result, qq(2000), 'check content of standby 2');
 # Initialize master node
 my $node_master_2 = get_new_node('master_2');
 $node_master_2->init(allows_streaming => 1, has_archiving => 1);
+$node_master_2->append_conf(
+       'postgresql.conf', qq(
+wal_keep_size = 512MB
+));
 $node_master_2->start;
 
 # Take backup
-- 
2.27.0

From c7680a8a59218b78a71b5dc1a91d935576369735 Mon Sep 17 00:00:00 2001
From: Fujii Masao <[email protected]>
Date: Thu, 14 Jan 2021 14:37:01 +0900
Subject: [PATCH] Stabilize timeline switch regression test.

Commit fef5b47f6b added the regression test to check whether a standby is
able to follow a primary on a newer timeline when WAL archiving is enabled.
But the buildfarm member florican reported that this test failed because
the requested WAL segment was removed and replication failed. This is a
timing issue. Since the test neither uses a replication slot nor sets
wal_keep_size, a checkpoint could remove a WAL segment that is still
necessary for replication.

This commit stabilizes the test by setting wal_keep_size.

Back-patch to v13 where the regression test that this commit stabilizes
was added.

Author: Fujii Masao
Discussion: https://postgr.es/m/X//[email protected]
---
 src/test/recovery/t/004_timeline_switch.pl | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/src/test/recovery/t/004_timeline_switch.pl b/src/test/recovery/t/004_timeline_switch.pl
index 8dad044db4..c8dbd8f9df 100644
--- a/src/test/recovery/t/004_timeline_switch.pl
+++ b/src/test/recovery/t/004_timeline_switch.pl
@@ -75,6 +75,10 @@ is($result, qq(2000), 'check content of standby 2');
 # Initialize primary node
 my $node_primary_2 = get_new_node('primary_2');
 $node_primary_2->init(allows_streaming => 1, has_archiving => 1);
+$node_primary_2->append_conf(
+       'postgresql.conf', qq(
+wal_keep_size = 512MB
+));
 $node_primary_2->start;
 
 # Take backup
-- 
2.27.0
