On Mon, 14 Jul 2025 at 20:33, Fujii Masao <masao.fu...@oss.nttdata.com> wrote:
> On 2025/07/14 17:08, Japin Li wrote:
>> Hi all,
>>
>> I recently hit an error with our streaming replication setup:
>>
>> 2025-07-14 11:52:59.361 CST,"replicator","",728458,"10.9.9.74:35724",68747f1b.b1d8a,1,"START_REPLICATION",2025-07-14 11:52:59 CST,3/0,0,ERROR,58P01,"requested WAL segment 00000001000000000000000C has already been removed",,,,,,"START_REPLICATION 0/C000000 TIMELINE 1",,,"standby","walsender",,0
>>
>> It appears the requested WAL segment 00000001000000000000000C had already been
>> archived, and I confirmed its presence in the archive directory. However, when
>> the standby tried to request this file, the primary only searched for it in
>> pg_wal and didn't check the archive directory. I had to manually copy the
>> segment into pg_wal to get streaming replication working again.
>>
>> My question is: Can we make the primary automatically search the archive if
>> restore_command is set?
>>
>> I found that Fujii Masao also requested this feature [1], but it seems there
>> wasn't a consensus.
>
> Yeah, I still like this idea. It's useful, for example, when we want to
> temporarily retain WAL files, such as during planned standby maintenance,
> to avoid "requested WAL segment ... removed." error.
>
> Using a replication slot is one way to retain WAL files in pg_wal,
> but it requires the pg_wal directory to be large enough to hold all
> WAL generated during that time, which isn't always practical.
>

Agreed. Here is a patch that fixes this.

--
Regards,
Japin Li

From 9df3700bf0152c44e232755137c4681fd2c72e50 Mon Sep 17 00:00:00 2001
From: Japin Li <japi...@hotmail.com>
Date: Tue, 15 Jul 2025 13:58:53 +0800
Subject: [PATCH] Allow the walsender to retrieve WALs from the archive

---
 src/backend/access/transam/xlogarchive.c |  4 ++--
 src/backend/replication/walsender.c      | 10 ++++++++++
 2 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/src/backend/access/transam/xlogarchive.c b/src/backend/access/transam/xlogarchive.c
index 1ef1713c91a..fe932c11f44 100644
--- a/src/backend/access/transam/xlogarchive.c
+++ b/src/backend/access/transam/xlogarchive.c
@@ -66,9 +66,9 @@ RestoreArchivedFile(char *path, const char *xlogfname,
 
 	/*
 	 * Ignore restore_command when not in archive recovery (meaning we are in
-	 * crash recovery).
+	 * crash recovery) and non-walsender processes.
 	 */
-	if (!ArchiveRecoveryRequested)
+	if (!ArchiveRecoveryRequested && !am_walsender)
 		goto not_available;
 
 	/* In standby mode, restore_command might not be supplied */
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 28b8591efa5..438b5d27a32 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -53,6 +53,7 @@
 #include "access/transam.h"
 #include "access/xact.h"
 #include "access/xlog_internal.h"
+#include "access/xlogarchive.h"
 #include "access/xlogreader.h"
 #include "access/xlogrecovery.h"
 #include "access/xlogutils.h"
@@ -3068,6 +3069,15 @@ WalSndSegmentOpen(XLogReaderState *state, XLogSegNo nextSegNo,
 		int			save_errno = errno;
 
 		XLogFileName(xlogfname, *tli_p, nextSegNo, wal_segment_size);
+
+		/* Restore WALs from archive if not found in XLOGDIR. */
+		if (RestoreArchivedFile(path, xlogfname, xlogfname, wal_segment_size, false))
+		{
+			state->seg.ws_file = BasicOpenFile(path, O_RDONLY | PG_BINARY);
+			if (state->seg.ws_file >= 0)
+				return;
+		}
+
 		errno = save_errno;
 		ereport(ERROR,
 				(errcode_for_file_access(),
-- 
2.43.0
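
For anyone who wants to see the lookup order in isolation, below is a minimal standalone sketch of the idea behind the WalSndSegmentOpen() change: try the WAL directory first, and only on ENOENT try to fetch the segment from the archive before reporting the original error. It is not PostgreSQL code. The function names open_segment_with_fallback() and restore_from_archive() are hypothetical, and the plain file copy is only an assumed stand-in for whatever restore_command would do.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical stand-in for restore_command: copy archive_dir/segname to dst.
 * Returns 0 on success, -1 on failure (a partial dst is removed).
 */
static int
restore_from_archive(const char *archive_dir, const char *segname,
					 const char *dst)
{
	char		src[1024];
	char		buf[8192];
	size_t		n;
	int			ok = 1;
	FILE	   *in;
	FILE	   *out;

	snprintf(src, sizeof(src), "%s/%s", archive_dir, segname);
	in = fopen(src, "rb");
	if (in == NULL)
		return -1;
	out = fopen(dst, "wb");
	if (out == NULL)
	{
		fclose(in);
		return -1;
	}
	while ((n = fread(buf, 1, sizeof(buf), in)) > 0)
	{
		if (fwrite(buf, 1, n, out) != n)
		{
			ok = 0;
			break;
		}
	}
	if (ferror(in))
		ok = 0;
	fclose(in);
	if (fclose(out) != 0)
		ok = 0;
	if (!ok)
	{
		remove(dst);
		return -1;
	}
	return 0;
}

/*
 * Open wal_dir/segname read-only. If it does not exist, try to restore it
 * from archive_dir and open it again. On failure, return -1 with errno set
 * to the error from the first open attempt.
 */
int
open_segment_with_fallback(const char *wal_dir, const char *archive_dir,
						   const char *segname)
{
	char		path[1024];
	int			fd;
	int			save_errno;

	assert(wal_dir != NULL && archive_dir != NULL && segname != NULL);

	snprintf(path, sizeof(path), "%s/%s", wal_dir, segname);
	fd = open(path, O_RDONLY);
	if (fd >= 0 || errno != ENOENT)
		return fd;				/* found, or failed for another reason */

	save_errno = errno;
	if (restore_from_archive(archive_dir, segname, path) == 0)
	{
		fd = open(path, O_RDONLY);
		if (fd >= 0)
			return fd;
	}
	errno = save_errno;			/* report the original "not found" error */
	return -1;
}
```

As in the patch, the original errno is saved before the restore attempt and put back afterwards, so a failed restore still surfaces as the "already been removed" case rather than as some unrelated error from the copy.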