I'd like to look into the WAL segments related to the failure. Mmm... With the patch, xlogreader->abortedRecPtr is valid only and always when the last read failed record was an aborted contrec. If recovery ends here the first insereted record is an "aborted contrec" record. I still see it as the only chance that an aborted contrecord is followed by a non-"aborted contrec" record is that recovery somehow fetches two consecutive WAL segments that are inconsistent at the boundary.
I found the reason that the TAP test doesn't respond to the first proposed patch (the below). - if (!StandbyMode && + if (!StandbyModeRequested && !XLogRecPtrIsInvalid(xlogreader->abortedRecPtr)) The cause was that I disabled standby-mode in the test. The change affects only while standby mode is on, which was to make the test reliable and simpler. The first attached detects the same error (in a somwhat maybe-unstable way) and responds to the fix above, and also responds to the aborted_contrec_reset_3.patch. So, aborted_contrec_reset_3 looks closer to the issue than before. Would you mind trying the second attached to abtain detailed log on your testing environment? With the patch, the modified TAP test yields the log lines like below. 2022-06-28 15:49:20.661 JST [165472] LOG: ### [A] @0/1FFD338: abort=(0/1FFD338)0/0, miss=(0/2000000)0/0, SbyMode=0, SbyModeReq=1 ... 2022-06-28 15:49:20.681 JST [165472] LOG: ### [F] @0/2094610: abort=(0/0)0/1FFD338, miss=(0/0)0/2000000, SbyMode=1, SbyModeReq=1 ... 2022-06-28 15:49:20.767 JST [165472] LOG: ### [S] @0/2094610: abort=(0/0)0/1FFD338, miss=(0/0)0/2000000, SbyMode=0, SbyModeReq=1 ... 2022-06-28 15:49:20.777 JST [165470] PANIC: xlog flush request 0/2094610 is not satisfied --- flushed only to 0/2000088 In this example, abortedRecPtr is set at the first line and recovery continued to 2094610 but abortedRecPtr is not reset then PANICed. ([A] means aborted contrec falure. [F] and [S] are failed and succeeded reads respectively.) regards. -- Kyotaro Horiguchi NTT Open Source Software Center
diff --git a/src/test/recovery/t/026_overwrite_contrecord.pl b/src/test/recovery/t/026_overwrite_contrecord.pl index 78feccd9aa..043eff86ec 100644 --- a/src/test/recovery/t/026_overwrite_contrecord.pl +++ b/src/test/recovery/t/026_overwrite_contrecord.pl @@ -59,6 +59,7 @@ $node->safe_psql('postgres', qq{SELECT pg_logical_emit_message(true, 'test 026', repeat('xyzxz', 123456))} ); #$node->safe_psql('postgres', qq{create table foo ()}); +my $endlsn = $node->safe_psql('postgres', 'SELECT pg_current_wal_insert_lsn()'); my $endfile = $node->safe_psql('postgres', 'SELECT pg_walfile_name(pg_current_wal_insert_lsn())'); ok($initfile ne $endfile, "$initfile differs from $endfile"); @@ -68,9 +69,8 @@ ok($initfile ne $endfile, "$initfile differs from $endfile"); # contents $node->stop('immediate'); -unlink $node->basedir . "/pgdata/pg_wal/$endfile" - or die "could not unlink " . $node->basedir . "/pgdata/pg_wal/$endfile: $!"; - +system(sprintf("mv %s/pgdata/pg_wal/$endfile %s/archives/", $node->basedir, $node->basedir)); + # OK, create a standby at this spot. $node->backup_fs_cold('backup'); my $node_standby = PostgreSQL::Test::Cluster->new('standby'); @@ -106,4 +106,17 @@ $node_standby->promote; $node->stop; $node_standby->stop; +my $node_primary_2 = PostgreSQL::Test::Cluster->new('primary_2'); +$node_primary_2->init_from_backup($node, 'backup', has_restoring => 1, standby => 1); + +$node_primary_2->append_conf( + 'postgresql.conf', qq( +log_min_messages = debug1 +)); +$node_primary_2->start; +$node_primary_2->poll_query_until('postgres', + "SELECT '$endlsn'::pg_lsn <= pg_last_wal_replay_lsn()"); +$node_primary_2->promote; +$node_primary_2->poll_query_until('postgres', 'SELECT NOT pg_is_in_recovery()'); +$node_primary_2->safe_psql('postgres', 'CREATE TABLE bar AS select generate_series(0, 99999) a'); done_testing();
diff --git a/src/backend/access/transam/xlogrecovery.c b/src/backend/access/transam/xlogrecovery.c index 6eba626420..cd02e6d8dd 100644 --- a/src/backend/access/transam/xlogrecovery.c +++ b/src/backend/access/transam/xlogrecovery.c @@ -2967,8 +2967,17 @@ ReadRecord(XLogPrefetcher *xlogprefetcher, int emode, for (;;) { char *errormsg; + static int log = 2; record = XLogPrefetcherReadRecord(xlogprefetcher, &errormsg); + + if (!record) + log = 2; + else if (log > 0) + log--; + + if (log) + elog(LOG, "### [%s] @%X/%X: abort=(%X/%X)%X/%X, miss=(%X/%X)%X/%X, SbyMode=%d, SbyModeReq=%d", record ? "S": (XLogRecPtrIsInvalid(xlogreader->abortedRecPtr) ? "F" : "A"), LSN_FORMAT_ARGS(xlogreader->EndRecPtr), LSN_FORMAT_ARGS(xlogreader->abortedRecPtr), LSN_FORMAT_ARGS(abortedRecPtr), LSN_FORMAT_ARGS(xlogreader->missingContrecPtr), LSN_FORMAT_ARGS(missingContrecPtr), StandbyMode, StandbyModeRequested); if (record == NULL) { /*