Thank you for the comment. Three new patches are attached. I
forgot to give a revision number on the previous patch, but I
think this is the 2nd version.

1. walrcv_reply_fix_94_v2.patch
   Walreceiver patch applyable on master/REL9_4_STBLE/REL9_3_STABLE

2. walrcv_reply_fix_92_v2.patch
   Walreceiver patch applyable on REL9_2_STABLE

3. walrcv_reply_fix_91_v2.patch
   Walreceiver patch applyable on REL9_1_STABLE


At Sat, 14 Feb 2015 03:01:22 +0900, Fujii Masao <masao.fu...@gmail.com> wrote 
in <cahgqgwhz_4ywyoka+5ks9s_698adjh8+0hamcsc9wyfh37g...@mail.gmail.com>
> On Tue, Feb 10, 2015 at 7:48 PM, Kyotaro HORIGUCHI
> <horiguchi.kyot...@lab.ntt.co.jp> wrote:
> >> Introduce the maximum number of records that we can receive and
> >> process between feedbacks? For example, we can change
> >> XLogWalRcvSendHSFeedback() so that it's called per at least
> >> 8 records. I'm not sure what number is good, though...
> >
> > At the beginning of the "while(len > 0){if (len > 0){" block,
> > last_recv_timestamp is always kept up to date (by using
> > gettimeotda():). So we can use the variable instead of
> > gettimeofday() in my previous proposal.
> 
> Yes, but something like last_recv_timestamp doesn't exist in
> REL9_1_STABLE. So you don't think that your proposed fix
> should be back-patched to all supported versions. Right?

Back to 9.3 has the loop with the same structure. 9.2 is in a bit
differenct shape but looks to have the same issue. 9.1 and 9.0
has the same staps with 9.2 but 9.0 doesn't has wal sender
timeout and 9.1 does not have a variable having current time.

9.3 and later: the previous patch works, but revised as your
  comment.

9.2: The time of the last reply is stored in the file-scope
  variable reply_message, and the current time is stored in
  walrcv->lastMsgReceiptTime. The timeout is determined using
  theses variables.

9.1: walrcv doesn't have lastMsgReceiptTime. The current time
  should be acquired explicitly in the innermost loop. I tried
  calling gettimeofday() once per several loops. The skip count
  is determined by anticipated worst duration to process a wal
  record and wal_receiver_status_interval. However, I doubt the
  necessity to do so.. The value of the worst duration is simply
  a random guess.

9.0: No point to do this kind of fix.


> > The start time of the timeout could be last_recv_timestamp at the
> > beginning of the while loop, since it is a bit earlier than the
> > actual time at the point.
> 
> Sounds strange to me. I think that last_recv_timestamp should be
> compared with the time when the last feedback message was sent,
> e.g., maybe you can expose sendTime in XLogWalRcvSendReplay()
> as global variable, and use it to compare with last_recv_timestamp.

You're right to do exactly right thing, but, as you mentioned,
exposing sendTime is requied to do so. The solution I proposed is
the second-best assuming that wal_sender_timeout is recommended
to be longer enough (several times longer) than
wal_receiver_status_interval. If no one minds to expose sendTime,
it is the best solution. The attached patch does it.

# The added comment in the patch was wrong, though.

> When we get out of the WAL receive loop due to the timeout check
> which your patch added, XLogWalRcvFlush() is always executed.
> I don't think that current WAL file needs to be flushed at that time.

Thank you for pointing it out. Run XLogWalRcvSendReply(force =
true) immediately instead of breaking from the loop.

> > The another solution would be using
> > RegisterTimeout/enable_timeout_after and it seemed to be work for
> > me. It can throw out the time counting stuff in the loop we are
> > talking about and that of XLogWalRcvSendHSFeedback and
> > XLogWalRcvSendReply, but it might be a bit too large for the
> > gain.
> 
> Yes, sounds overkill.

However, gettimeofday() for each recv is also a overkill for most
cases. I'll post the patches for receivelog.c based on the patch
for 9.1 wal receiver.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
>From 184f72ac34e7b787527dfa8ed76c1fbd2d970407 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horiguchi.kyot...@lab.ntt.co.jp>
Date: Tue, 10 Feb 2015 14:56:23 +0900
Subject: [PATCH] Make walreceiver to keep regular reply message even on heavy
 load v2.

Wal receiver cannot send receiver reply message while it is receiving
continuous WAL stream caused by heavy load or something else. This
patch makes wal receiver to send reply message even on such a
situation.
---
 src/backend/replication/walreceiver.c | 59 ++++++++++++++++++++---------------
 1 file changed, 33 insertions(+), 26 deletions(-)

diff --git a/src/backend/replication/walreceiver.c b/src/backend/replication/walreceiver.c
index c2d4ed3..43d218d 100644
--- a/src/backend/replication/walreceiver.c
+++ b/src/backend/replication/walreceiver.c
@@ -91,6 +91,12 @@ static XLogSegNo recvSegNo = 0;
 static uint32 recvOff = 0;
 
 /*
+ * The time when the last wal receiver reply was sent. This is used to escape
+ * from receiving loop so that replay messages are kept regulary.
+ */
+static TimestampTz walRcvReplySendTime = 0;
+
+/*
  * Flags set by interrupt handlers of walreceiver for later service in the
  * main loop.
  */
@@ -423,32 +429,34 @@ WalReceiverMain(void)
 					 * Process the received data, and any subsequent data we
 					 * can read without blocking.
 					 */
-					for (;;)
+					while (len > 0)
 					{
-						if (len > 0)
-						{
-							/*
-							 * Something was received from master, so reset
-							 * timeout
-							 */
-							last_recv_timestamp = GetCurrentTimestamp();
-							ping_sent = false;
-							XLogWalRcvProcessMsg(buf[0], &buf[1], len - 1);
-						}
-						else if (len == 0)
-							break;
-						else if (len < 0)
-						{
-							ereport(LOG,
-									(errmsg("replication terminated by primary server"),
-									 errdetail("End of WAL reached on timeline %u at %X/%X.",
-											   startpointTLI,
-											   (uint32) (LogstreamResult.Write >> 32), (uint32) LogstreamResult.Write)));
-							endofwal = true;
-							break;
-						}
+						/*
+						 * Something was received from master, so reset
+						 * timeout
+						 */
+						last_recv_timestamp = GetCurrentTimestamp();
+						ping_sent = false;
+						XLogWalRcvProcessMsg(buf[0], &buf[1], len - 1);
+
+						/* Send reply if needed */
+						if (TimestampDifferenceExceeds(walRcvReplySendTime,
+											last_recv_timestamp,
+											wal_receiver_status_interval * 1000))
+							XLogWalRcvSendReply(true, false);
+
 						len = walrcv_receive(0, &buf);
 					}
+					if (len < 0)
+					{
+						ereport(LOG,
+								(errmsg("replication terminated by primary server"),
+								 errdetail("End of WAL reached on timeline %u at %X/%X.",
+										   startpointTLI,
+										   (uint32) (LogstreamResult.Write >> 32), (uint32) LogstreamResult.Write)));
+						endofwal = true;
+						break;
+					}
 
 					/* Let the master know that we received some data. */
 					XLogWalRcvSendReply(false, false);
@@ -1040,7 +1048,6 @@ XLogWalRcvSendReply(bool force, bool requestReply)
 	static XLogRecPtr writePtr = 0;
 	static XLogRecPtr flushPtr = 0;
 	XLogRecPtr	applyPtr;
-	static TimestampTz sendTime = 0;
 	TimestampTz now;
 
 	/*
@@ -1065,10 +1072,10 @@ XLogWalRcvSendReply(bool force, bool requestReply)
 	if (!force
 		&& writePtr == LogstreamResult.Write
 		&& flushPtr == LogstreamResult.Flush
-		&& !TimestampDifferenceExceeds(sendTime, now,
+		&& !TimestampDifferenceExceeds(walRcvReplySendTime, now,
 									   wal_receiver_status_interval * 1000))
 		return;
-	sendTime = now;
+	walRcvReplySendTime = now;
 
 	/* Construct a new message */
 	writePtr = LogstreamResult.Write;
-- 
2.1.0.GIT

>From ed60223555ae6b30c203276f1cf8890711e0c267 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horiguchi.kyot...@lab.ntt.co.jp>
Date: Tue, 17 Feb 2015 19:00:07 +0900
Subject: [PATCH] Make walreceiver to keep regular reply message even on heavy
 load v2.

Wal receiver cannot send receiver reply message while it is receiving
continuous WAL stream caused by heavy load or something else. This
patch makes wal receiver to send reply message even on such a
situation.
---
 src/backend/replication/walreceiver.c | 17 +++++++++++++----
 1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/src/backend/replication/walreceiver.c b/src/backend/replication/walreceiver.c
index 753316e..407e70b 100644
--- a/src/backend/replication/walreceiver.c
+++ b/src/backend/replication/walreceiver.c
@@ -319,13 +319,22 @@ WalReceiverMain(void)
 		/* Wait a while for data to arrive */
 		if (walrcv_receive(NAPTIME_PER_CYCLE, &type, &buf, &len))
 		{
-			/* Accept the received data, and process it */
-			XLogWalRcvProcessMsg(type, buf, len);
-
 			/* Receive any more data we can without sleeping */
-			while (walrcv_receive(0, &type, &buf, &len))
+			do
+			{
+				/* Accept the received data, and process it */
 				XLogWalRcvProcessMsg(type, buf, len);
 
+				/*
+				 * walrecv->lastMsgReceiptTime has the time as of
+				 * XLogWalRcvProcessMsg just above.
+				 */
+				if (TimestampDifferenceExceeds(reply_message.sendTime,
+											   walrcv->lastMsgReceiptTime,
+											   wal_receiver_status_interval * 1000))
+					XLogWalRcvSendReply();
+			} while (walrcv_receive(0, &type, &buf, &len));
+
 			/* Let the master know that we received some data. */
 			XLogWalRcvSendReply();
 
-- 
2.1.0.GIT

>From 64dbbd49719656eb3fbeb7fd8582a0fb952d54e9 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horiguchi.kyot...@lab.ntt.co.jp>
Date: Tue, 17 Feb 2015 19:00:07 +0900
Subject: [PATCH] Make walreceiver to keep regular reply message even on heavy
 load v2.

Wal receiver cannot send receiver reply message while it is receiving
continuous WAL stream caused by heavy load or something else. This
patch makes wal receiver to send reply message even on such a
situation.
---
 src/backend/replication/walreceiver.c | 31 ++++++++++++++++++++++++++++---
 1 file changed, 28 insertions(+), 3 deletions(-)

diff --git a/src/backend/replication/walreceiver.c b/src/backend/replication/walreceiver.c
index 3dfc3d6..a3e1fd9 100644
--- a/src/backend/replication/walreceiver.c
+++ b/src/backend/replication/walreceiver.c
@@ -67,6 +67,10 @@ walrcv_send_type walrcv_send = NULL;
 walrcv_disconnect_type walrcv_disconnect = NULL;
 
 #define NAPTIME_PER_CYCLE 100	/* max sleep time between cycles (100ms) */
+#define WAL_PROCESS_WORST_DURATION 1 /* Anticipated worst duration to process
+									  * wal record in seconds. This is used to
+									  * calculate how often to check the time
+									  * to send reply message */
 
 /*
  * These variables are used similarly to openLogFile/Id/Seg/Off,
@@ -316,13 +320,34 @@ WalReceiverMain(void)
 		/* Wait a while for data to arrive */
 		if (walrcv_receive(NAPTIME_PER_CYCLE, &type, &buf, &len))
 		{
-			/* Accept the received data, and process it */
-			XLogWalRcvProcessMsg(type, buf, len);
+			/*
+			 * Check timeout once per several loops in order to avoid exccess
+			 * calling of gettimeofday().
+			 */
+			int check_interval = 
+				wal_receiver_status_interval / WAL_PROCESS_WORST_DURATION;
+			int count = 0;
 
 			/* Receive any more data we can without sleeping */
-			while (walrcv_receive(0, &type, &buf, &len))
+			do
+			{
+				/* Accept the received data, and process it */
 				XLogWalRcvProcessMsg(type, buf, len);
 
+				/*
+				 * walrecv->lastMsgReceiptTime has the time as of
+				 * XLogWalRcvProcessMsg just above.
+				 */
+				if (count++ >= check_interval)
+				{
+					if (TimestampDifferenceExceeds(reply_message.sendTime,
+										GetCurrentTimestamp(),
+										wal_receiver_status_interval * 1000))
+						XLogWalRcvSendReply();
+					count = 0;
+				}
+			} while (walrcv_receive(0, &type, &buf, &len));
+
 			/* Let the master know that we received some data. */
 			XLogWalRcvSendReply();
 
-- 
2.1.0.GIT

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to