On 2014-05-16 17:02:33 -0400, Steve Singer wrote:
> >I don't think that's going to cut it though. The creation can take
> >longer than whatever wal_sender_timeout is set to (when there's lots of
> >longrunning transactions). I think checking whether last_reply_timestamp
> >= 0 during timeout checking is more robust.
> That makes sense.
> A patch that does that is attached.
I've attached a heavily revised version of that patch. Didn't touch all
the necessary places...
Survives creation of logical replication slots under 'intensive
pressure', with a wal_sender_timeout=10ms.
Thanks for the bugreport!
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
>From 324d74e16dd8ee2a0fa977d92a269fd43a746196 Mon Sep 17 00:00:00 2001
From: Andres Freund <and...@anarazel.de>
Date: Sat, 17 May 2014 01:22:01 +0200
Subject: [PATCH] Don't pay heed to wal_sender_timeout while creating a
Sometimes CREATE_REPLICATION_SLOT ... LOGICAL ... needs to wait for
futher WAL using WalSndWaitForWal(). That used to always respect
wal_sender_timeout and kill the session when waiting long enough
because no feedback/ping messages can be sent while the slot is still
Add the notion that last_reply_timestamp = 0 means that the walsender
currently doesn't need timeout processing and add checks for it in a
couple of places.
Bugreport and initial patch by Steve Singer, revised by Andres Freund.
src/backend/replication/walsender.c | 25 +++++++++++++++++++++----
1 file changed, 21 insertions(+), 4 deletions(-)
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 5c11d68..90394ce 100644
@@ -148,9 +148,10 @@ static StringInfoData reply_message;
static StringInfoData tmpbuf;
- * Timestamp of the last receipt of the reply from the standby.
+ * Timestamp of the last receipt of the reply from the standby. Set to 0 if a
+ * the process currently shouldn't be killed by wal_sender_timeout.
-static TimestampTz last_reply_timestamp;
+static TimestampTz last_reply_timestamp = 0;
/* Have we sent a heartbeat message asking for reply, since last reply? */
static bool waiting_for_ping_response = false;
@@ -796,6 +797,14 @@ CreateReplicationSlot(CreateReplicationSlotCmd *cmd)
+ * Signal that we don't need the timeout mechanism. We're just
+ * creating the replication slot and don't yet accept feedback
+ * messages or send keepalives. As we possibly need to wait for
+ * further WAL the walsender would possibly be killed too soon.
+ last_reply_timestamp = 0;
/* build initial snapshot, might take a while */
@@ -1693,7 +1702,7 @@ WalSndComputeSleeptime(TimestampTz now)
long sleeptime = 10000; /* 10 s */
- if (wal_sender_timeout > 0)
+ if (wal_sender_timeout > 0 && last_reply_timestamp > 0)
@@ -1735,6 +1744,10 @@ WalSndCheckTimeOut(TimestampTz now)
+ /* don't bail out if we're doing something that doesn't require timeouts */
+ if (last_reply_timestamp <= 0)
timeout = TimestampTzPlusMilliseconds(last_reply_timestamp,
@@ -2879,7 +2892,11 @@ WalSndKeepaliveIfNecessary(TimestampTz now)
- if (wal_sender_timeout <= 0)
+ * Don't send keepalive messages if timeouts are globally disabled or
+ * we're doing something not partaking in timeouts.
+ if (wal_sender_timeout <= 0 || last_reply_timestamp <= 0)
Sent via pgsql-hackers mailing list (firstname.lastname@example.org)
To make changes to your subscription: