Re: Changing the state of data checksums in a running cluster

Tomas Vondra Thu, 28 Aug 2025 09:11:46 -0700

Hi,

I spent a bit more time fixing the TAP test. The attached patch makes it
"work" for me (or I think it should, in principle). I'm not saying it's
the best way to do it.


With the patch applied, I ran the test and got a failure in
pg_checksums. The log snippet at the end shows the failure; AFAICS the
sequence is this:

1) checksums are disabled
2) flip_data_checksums gets called
3) both clusters go through 'inprogress-on' and 'on' states
4) primary gets shut down in 'immediate' mode
5) standby gets shut down in 'fast' mode
6) we try to validate checksums on the standby, but the control file
still says checksums=inprogress-on

This seems like a bug to me - AFAICS the expectation is that after a
fast shutdown, we don't forget the checksum state. Or is that expected?
In that case the TAP test probably needs to read the state from the
control file, instead of relying on the Perl variable
$data_checksum_state. Or maybe it should assert that the control file
has the expected state?
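
FWIW here is a minimal sketch of what reading the state from the control
file could look like. It is not part of the attached patch, and the helper
name checksum_state_from_controldata is made up for illustration. It only
extracts the raw value of the "Data page checksum version" line from
pg_controldata output, and makes no assumption about what the patch prints
there for the intermediate states. The sample input is hand-written, not
taken from a real run.

```perl
use strict;
use warnings;

# Hypothetical helper (not in the attached patch): return the raw value of
# the "Data page checksum version" line from pg_controldata output, or
# undef if the line is missing. The value is returned as-is, because what
# gets printed for states like inprogress-on depends on the patch.
sub checksum_state_from_controldata
{
	my ($controldata_output) = @_;

	foreach my $line (split /\n/, $controldata_output)
	{
		return $1
		  if $line =~ /^Data page checksum version:\s*(\S.*?)\s*$/;
	}

	# Label not found, let the caller decide how to treat that.
	return;
}

# Hand-written sample input. In the TAP test the text would come from
# running pg_controldata against the stopped node's data directory.
my $sample = "pg_control version number:            1800\n"
  . "Data page checksum version:           1\n";

my $state = checksum_state_from_controldata($sample);
print "control file checksum state: $state\n";
```

The offline verification could then be gated on this value rather than on
$data_checksum_state, assuming that is the behavior we actually want.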

FWIW I don't think the primary shutdown matters. I've seen several of
these failures, and they happen even without the primary shutdown. But
the standby "fast" shutdown is always there.

But this also shows a limitation of the TAP test - it never triggers a
shutdown while the checksums are being flipped (in flip_data_checksums).
I think that's something worth testing.

regards

-- 
Tomas Vondra

diff --git a/src/test/modules/test_checksums/t/006_concurrent_pgbench.pl b/src/test/modules/test_checksums/t/006_concurrent_pgbench.pl
index b33ca6e0c26..5cee6d4a6b5 100644
--- a/src/test/modules/test_checksums/t/006_concurrent_pgbench.pl
+++ b/src/test/modules/test_checksums/t/006_concurrent_pgbench.pl
@@ -55,7 +55,7 @@ if ($ENV{enable_injection_points} ne 'yes')
 # whether to turn things off during testing.
 sub cointoss
 {
-	return int(rand(2) == 1);
+	return int(rand() < 0.5);
 }
 
 # Helper for injecting random sleeps here and there in the testrun. The sleep
@@ -74,7 +74,7 @@ sub background_ro_pgbench
 	my ($port, $stdin, $stdout, $stderr) = @_;
 
 	my $pgbench_primary = IPC::Run::start(
-		[ 'pgbench', '-p', $port, '-S', '-T', '600', '-c', '10', 'postgres' ],
+		[ 'pgbench', '-n', '-p', $port, '-S', '-T', '600', '-c', '10', 'postgres' ],
 		'<' => \$stdin,
 		'>' => \$stdout,
 		'2>' => \$stderr,
@@ -224,6 +224,9 @@ background_rw_pgbench(
 	$node_primary->port, $pgb_primary_stdin,
 	$pgb_primary_stdout, $pgb_primary_stderr);
 
+my $primary_shutdown_clean = 0;
+my $standby_shutdown_clean = 0;
+
 # Main test suite. This loop will start a pgbench run on the cluster and while
 # that's running flip the state of data checksums concurrently. It will then
 # randomly restart thec cluster (in fast or immediate) mode and then check for
@@ -246,9 +249,11 @@ for (my $i = 0; $i < $TEST_ITERATIONS; $i++)
 		$node_primary_loglocation = -s $node_primary->logfile;
 
 		# If data checksums are enabled, take the opportunity to verify them
-		# while the cluster is offline
+		# while the cluster is offline (but only if stopped in a clean way,
+		# not after immediate shutdown)
 		$node_primary->checksum_verify_offline()
-		  unless $data_checksum_state eq 'off';
+		  unless $data_checksum_state eq 'off' or !$primary_shutdown_clean;
+
 		random_sleep();
 		$node_primary->start;
 		# Start a pgbench in the background against the primary
@@ -270,9 +275,11 @@ for (my $i = 0; $i < $TEST_ITERATIONS; $i++)
 		$node_standby_1_loglocation = -s $node_standby_1->logfile;
 
 		# If data checksums are enabled, take the opportunity to verify them
-		# while the cluster is offline
+		# while the cluster is offline (but only if stopped in a clean way,
+		# not after immediate shutdown)
 		$node_standby_1->checksum_verify_offline()
-		  unless $data_checksum_state eq 'off';
+		  unless $data_checksum_state eq 'off' or !$standby_shutdown_clean;
+
 		random_sleep();
 		$node_standby_1->start;
 		# Start a select-only pgbench in the background on the standby
@@ -287,13 +294,41 @@ for (my $i = 0; $i < $TEST_ITERATIONS; $i++)
 	my $result = $node_primary->safe_psql('postgres',
 		"SELECT count(*) FROM t WHERE a > 1");
 	is($result, '100000', 'ensure data pages can be read back on primary');
+
 	random_sleep();
+
 	$node_primary->wait_for_catchup($node_standby_1, 'write');
 
-	# Potentially powercycle the cluster
-	$node_primary->stop($stop_modes[ int(rand(100)) ]) if cointoss();
 	random_sleep();
-	$node_standby_1->stop($stop_modes[ int(rand(100)) ]) if cointoss();
+
+	# Potentially powercycle the cluster (the nodes independently)
+	# XXX should maybe try stopping nodes in the opposite order too?
+	if (cointoss())
+	{
+		my $mode = $stop_modes[ int(rand(100)) ];
+		$node_primary->stop($mode);
+		$primary_shutdown_clean = ($mode eq 'fast');
+	}
+
+	random_sleep();
+
+	if (cointoss())
+	{
+		my $mode = $stop_modes[ int(rand(100)) ];
+		$node_standby_1->stop($mode);
+		$standby_shutdown_clean = ($mode eq 'fast');
+	}
+}
+
+# make sure the nodes are running
+if (!$node_primary->is_alive)
+{
+	$node_primary->start;
+}
+
+if (!$node_standby_1->is_alive)
+{
+	$node_standby_1->start;
 }
 
 # Testrun is over, ensure that data reads back as expected and perform a final

# Postmaster PID for node "standby_1" is 27122
[17:38:45.503](0.673s) ok 104 - ensure checksums are set to off
[17:38:45.513](0.011s) ok 105 - ensure checksums are set to off
[17:38:45.537](0.024s) ok 106 - ensure data checksums are transitioned to 
inprogress-on
Waiting for replication conn standby_1's replay_lsn to pass 6/8FCAB730 on main
done
[17:38:46.574](1.037s) ok 107 - ensure standby has absorbed the inprogress-on 
barrier
[17:38:47.585](1.011s) ok 108 - ensure checksums are on, or in progress, on 
standby_1
[17:39:03.705](16.119s) ok 109 - ensure data checksums are transitioned to on
[17:39:03.716](0.011s) ok 110 - ensure data checksums are transitioned to on
[17:39:03.784](0.068s) ok 111 - ensure data pages can be read back on primary
Waiting for replication conn standby_1's write_lsn to pass 7/2B4BD0F0 on main
done
### Stopping node "main" using mode immediate
# Running: pg_ctl --pgdata 
/home/tomas/postgres/src/test/modules/test_checksums/tmp_check/t_006_concurrent_pgbench_main_data/pgdata
 --mode immediate stop
waiting for server to shut down.... done
server stopped
# No postmaster PID for node "main"
### Stopping node "standby_1" using mode fast
# Running: pg_ctl --pgdata 
/home/tomas/postgres/src/test/modules/test_checksums/tmp_check/t_006_concurrent_pgbench_standby_1_data/pgdata
 --mode fast stop
waiting for server to shut down.... done
server stopped
# No postmaster PID for node "standby_1"
# Running: pg_isready --timeout 180 --host /tmp/800zPudzD2 --port 30082
/tmp/800zPudzD2:30082 - no response
[17:39:06.021](2.237s) ok 112 - no checksum validation errors in primary log
### Starting node "main"
# Running: pg_ctl --wait --pgdata 
/home/tomas/postgres/src/test/modules/test_checksums/tmp_check/t_006_concurrent_pgbench_main_data/pgdata
 --log 
/home/tomas/postgres/src/test/modules/test_checksums/tmp_check/log/006_concurrent_pgbench_main.log
 --options --cluster-name=main start
waiting for server to start.... done
server started
# Postmaster PID for node "main" is 27488
# Running: pg_isready --timeout 180 --host /tmp/800zPudzD2 --port 30083
/tmp/800zPudzD2:30083 - no response
[17:39:06.132](0.111s) ok 113 - no checksum validation errors in standby_1 log
# Running: pg_checksums -D 
/home/tomas/postgres/src/test/modules/test_checksums/tmp_check/t_006_concurrent_pgbench_standby_1_data/pgdata
 -c
pg_checksums: error: data checksums are not enabled in cluster
[17:39:06.134](0.002s) Bail out!  command "pg_checksums -D 
/home/tomas/postgres/src/test/modules/test_checksums/tmp_check/t_006_concurrent_pgbench_standby_1_data/pgdata
 -c" exited with value 1
# Postmaster PID for node "main" is 27488
### Stopping node "main" using mode immediate
# Running: pg_ctl --pgdata 
/home/tomas/postgres/src/test/modules/test_checksums/tmp_check/t_006_concurrent_pgbench_main_data/pgdata
 --mode immediate stop
waiting for server to shut down.... done
server stopped
# No postmaster PID for node "main"
# No postmaster PID for node "standby_1"
