On 2019-11-07 05:16, Michael Paquier wrote:
The current defaults of pg_basebackup have been thought so as the
backups taken have a good stability and so as monitoring is eased
thanks to --wal-method=stream and the use of replication slots.
Shouldn't the use of a least a temporary replication slot be mandatory
for the stability of the copy?  It seems to me that there is a good
argument for having a second process which streams WAL on top of the
main backup process, and just use a WAL receiver for that.
Is this something that the walreceiver should be doing independent of this
patch?
There could be an argument for switching a WAL receiver to use a
temporary replication slot by default.  Still, it seems to me that
this backup solution suffers from the same set of problems we have
spent years to fix with pg_basebackup with missing WAL files caused by
concurrent checkpoints removing things needed while the copy of the
main data folder and other tablespaces happens.

I looked into this. It seems trivial to make walsender create and use a temporary replication slot by default if no permanent replication slot is specified. This is basically the logic that pg_basebackup has but done server-side. See attached patch for a demonstration. Any reason not to do that?

--
Peter Eisentraut              http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>From 6ae76011ece6a5900cc06e2350b0ccb930eb41a0 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut <pe...@eisentraut.org>
Date: Sat, 9 Nov 2019 22:10:19 +0100
Subject: [PATCH] walsender uses a temporary replication slot by default

---
 src/backend/replication/walsender.c   | 9 +++++++--
 src/bin/pg_basebackup/pg_basebackup.c | 8 +++++++-
 2 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/src/backend/replication/walsender.c 
b/src/backend/replication/walsender.c
index 7f5671504f..c16455f3f0 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -563,6 +563,12 @@ StartReplication(StartReplicationCmd *cmd)
                                        
(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
                                         (errmsg("cannot use a logical 
replication slot for physical replication"))));
        }
+       else
+       {
+               char *slotname = psprintf("pg_walsender_%d", MyProcPid);
+
+               ReplicationSlotCreate(slotname, false, RS_TEMPORARY);
+       }
 
        /*
         * Select the timeline. If it was given explicitly by the client, use
@@ -703,8 +709,7 @@ StartReplication(StartReplicationCmd *cmd)
                Assert(streamingDoneSending && streamingDoneReceiving);
        }
 
-       if (cmd->slotname)
-               ReplicationSlotRelease();
+       ReplicationSlotRelease();
 
        /*
         * Copy is finished now. Send a single-row result set indicating the 
next
diff --git a/src/bin/pg_basebackup/pg_basebackup.c 
b/src/bin/pg_basebackup/pg_basebackup.c
index a9d162a7da..687bd331dd 100644
--- a/src/bin/pg_basebackup/pg_basebackup.c
+++ b/src/bin/pg_basebackup/pg_basebackup.c
@@ -68,6 +68,11 @@ typedef struct TablespaceList
  */
 #define MINIMUM_VERSION_FOR_TEMP_SLOTS 100000
 
+/*
+ * Version 13 creates temporary slots automatically.
+ */
+#define MINIMUM_VERSION_FOR_AUTO_TEMP_SLOTS 130000
+
 /*
  * Different ways to include WAL
  */
@@ -569,7 +574,8 @@ StartLogStreamer(char *startpos, uint32 timeline, char 
*sysidentifier)
                         "pg_xlog" : "pg_wal");
 
        /* Temporary replication slots are only supported in 10 and newer */
-       if (PQserverVersion(conn) < MINIMUM_VERSION_FOR_TEMP_SLOTS)
+       if (PQserverVersion(conn) < MINIMUM_VERSION_FOR_TEMP_SLOTS ||
+               PQserverVersion(conn) >= MINIMUM_VERSION_FOR_AUTO_TEMP_SLOTS)
                temp_replication_slot = false;
 
        /*
-- 
2.24.0

Reply via email to