On Mon, 14 Jul 2025 at 20:33, Fujii Masao <masao.fu...@oss.nttdata.com> wrote:
> On 2025/07/14 17:08, Japin Li wrote:
>> Hi all,
>> I recently hit an error with our streaming replication setup:
>>    2025-07-14 11:52:59.361 CST,"replicator","",728458,"10.9.9.74:35724",68747f1b.b1d8a,1,"START_REPLICATION",2025-07-14 11:52:59 CST,3/0,0,ERROR,58P01,"requested WAL segment 00000001000000000000000C has already been removed",,,,,,"START_REPLICATION 0/C000000 TIMELINE 1",,,"standby","walsender",,0
>> It appears the requested WAL segment 00000001000000000000000C had already
>> been archived, and I confirmed its presence in the archive directory.
>> However, when the standby tried to request this file, the primary only
>> searched for it in pg_wal and didn't check the archive directory. I had to
>> manually copy the segment into pg_wal to get streaming replication working
>> again.
>> My question is: Can we make the primary automatically search the archive
>> if restore_command is set?
>> I found that Fujii Masao also requested this feature [1], but it seems
>> there wasn't a consensus.
>
> Yeah, I still like this idea. It's useful, for example, when we want to
> temporarily retain WAL files, such as during planned standby maintenance,
> to avoid the "requested WAL segment ... removed" error.
>
> Using a replication slot is one way to retain WAL files in pg_wal,
> but it requires the pg_wal directory to be large enough to hold all
> WAL generated during that time, which isn't always practical.
>

Agreed. Here is a patch that lets the walsender fall back to
restore_command when a requested WAL segment is no longer in pg_wal.
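
To illustrate the control flow outside of PostgreSQL, here is a minimal
standalone sketch: try the WAL location first, and only on ENOENT try to
restore the file from the archive, keeping the original errno if that also
fails. The names open_segment_with_fallback and restore_from_archive are
hypothetical and exist only for this example. A plain file copy stands in
for restore_command here; the patch itself calls RestoreArchivedFile().

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical stand-in for restore_command: copy src to dst.
 * Returns 0 on success, -1 on failure (a partial dst is removed).
 */
static int
restore_from_archive(const char *src, const char *dst)
{
	char		buf[8192];
	ssize_t		n;
	int			in;
	int			out;

	in = open(src, O_RDONLY);
	if (in < 0)
		return -1;

	out = open(dst, O_WRONLY | O_CREAT | O_TRUNC, 0600);
	if (out < 0)
	{
		close(in);
		return -1;
	}

	while ((n = read(in, buf, sizeof(buf))) > 0)
	{
		if (write(out, buf, (size_t) n) != n)
		{
			n = -1;
			break;
		}
	}

	close(in);
	close(out);

	if (n < 0)
	{
		unlink(dst);			/* do not leave a truncated segment behind */
		return -1;
	}
	return 0;
}

/*
 * Open wal_path read-only. If it does not exist, try to restore it from
 * archive_path and open it again. Returns a file descriptor, or -1 with
 * errno set to the error from the first open attempt.
 */
int
open_segment_with_fallback(const char *wal_path, const char *archive_path)
{
	int			fd;
	int			save_errno;

	assert(wal_path != NULL && archive_path != NULL);

	fd = open(wal_path, O_RDONLY);
	if (fd >= 0)
		return fd;

	save_errno = errno;

	/* Only a missing file triggers the archive lookup. */
	if (save_errno == ENOENT &&
		restore_from_archive(archive_path, wal_path) == 0)
	{
		fd = open(wal_path, O_RDONLY);
		if (fd >= 0)
			return fd;
	}

	/* Report the original failure, as the caller would have seen it. */
	errno = save_errno;
	return -1;
}
```

Restoring errno before returning mirrors what the patch does with
save_errno, so the existing "has already been removed" error is still
reported when the archive does not have the segment either.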

-- 
Regards,
Japin Li
From 9df3700bf0152c44e232755137c4681fd2c72e50 Mon Sep 17 00:00:00 2001
From: Japin Li <japi...@hotmail.com>
Date: Tue, 15 Jul 2025 13:58:53 +0800
Subject: [PATCH] Allow the walsender to retrieve WALs from the archive

---
 src/backend/access/transam/xlogarchive.c |  4 ++--
 src/backend/replication/walsender.c      | 10 ++++++++++
 2 files changed, 12 insertions(+), 2 deletions(-)

diff --git a/src/backend/access/transam/xlogarchive.c b/src/backend/access/transam/xlogarchive.c
index 1ef1713c91a..fe932c11f44 100644
--- a/src/backend/access/transam/xlogarchive.c
+++ b/src/backend/access/transam/xlogarchive.c
@@ -66,9 +66,9 @@ RestoreArchivedFile(char *path, const char *xlogfname,
 
 	/*
 	 * Ignore restore_command when not in archive recovery (meaning we are in
-	 * crash recovery).
+	 * crash recovery), unless this process is a walsender.
 	 */
-	if (!ArchiveRecoveryRequested)
+	if (!ArchiveRecoveryRequested && !am_walsender)
 		goto not_available;
 
 	/* In standby mode, restore_command might not be supplied */
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 28b8591efa5..438b5d27a32 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -53,6 +53,7 @@
 #include "access/transam.h"
 #include "access/xact.h"
 #include "access/xlog_internal.h"
+#include "access/xlogarchive.h"
 #include "access/xlogreader.h"
 #include "access/xlogrecovery.h"
 #include "access/xlogutils.h"
@@ -3068,6 +3069,15 @@ WalSndSegmentOpen(XLogReaderState *state, XLogSegNo nextSegNo,
 		int			save_errno = errno;
 
 		XLogFileName(xlogfname, *tli_p, nextSegNo, wal_segment_size);
+
+		/* Restore WALs from archive if not found in XLOGDIR. */
+		if (RestoreArchivedFile(path, xlogfname, xlogfname, wal_segment_size, false))
+		{
+			state->seg.ws_file = BasicOpenFile(path, O_RDONLY | PG_BINARY);
+			if (state->seg.ws_file >= 0)
+				return;
+		}
+
 		errno = save_errno;
 		ereport(ERROR,
 				(errcode_for_file_access(),
-- 
2.43.0
