Re: [HACKERS] Strange debug message of walreciver?

2015-03-13 Thread Tatsuo Ishii
 On Sun, Mar 8, 2015 at 8:13 PM, Tatsuo Ishii is...@postgresql.org wrote:

  On Sat, Mar 7, 2015 at 10:18 PM, Tatsuo Ishii is...@postgresql.org
 wrote:
 
  When I set log_min_messages to debug5 and looked into walreciver log,
  I saw this:
 
  3600 2015-03-08 09:47:38 JST DEBUG:  sendtime 2015-03-08
  09:47:38.31493+09 receipttime 2015-03-08 09:47:38.315027+09 replication
  apply delay -1945478837 ms tran sfer latency 0 ms
 
  The replication apply delay -1945478837 ms part looks strange
  because the delay is below 0. The number is formatted as %d in elog
  call, and I suspect this is kind of integer overflow.
 
 
  Looking at GetReplicationApplyDelay() in walreceiverfuncs.c I noticed
 that
  the integer overflow occurs because sometimes the return of the
  GetCurrentChunkReplayStartTime() is 0 (zero).
 
  I added an elog into GetReplicationApplyDelay to check this info:
 
  DEBUG:  GetReplicationApplyDelay 479099832 seconds, 352 milliseconds,
  (0.00, 479099832352083.00)
  DEBUG:  sendtime 2015-03-08 00:17:12.351987-03 receipttime 2015-03-08
  00:17:12.352043-03 replication apply delay -1936504800 ms transfer
 latency
  0 ms
  DEBUG:  GetReplicationApplyDelay 479099841 seconds, 450 milliseconds,
  (0.00, 479099841450320.00)
  DEBUG:  sendtime 2015-03-08 00:17:21.45018-03 receipttime 2015-03-08
  00:17:21.450294-03 replication apply delay -1936495702 ms transfer
 latency
  0 ms

 Right. Until walreceiver gets the first WAL record to replay,
 xlogctl-currentChunkStartTime remains 0.

  Maybe we should check before and return zero from
 GetReplicationApplyDelay.
  The attached patch implement it.

 Your patch regards 0 replication apply delay in the case above which
 is confusing if the delay is actually 0.

 What about something like this to explicitly stats the delay data is
 not available?

 elog(DEBUG2, sendtime %s receipttime %s replication apply delay (N/A)
 transfer latency %d ms,

 
 Makes sense. Attached patch implement what you suggested.
 
 Looks good. If there's no objection, I will commit this.

Done.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Strange debug message of walreciver?

2015-03-08 Thread Tatsuo Ishii
 On Sat, Mar 7, 2015 at 10:18 PM, Tatsuo Ishii is...@postgresql.org wrote:

 When I set log_min_messages to debug5 and looked into walreciver log,
 I saw this:

 3600 2015-03-08 09:47:38 JST DEBUG:  sendtime 2015-03-08
 09:47:38.31493+09 receipttime 2015-03-08 09:47:38.315027+09 replication
 apply delay -1945478837 ms tran sfer latency 0 ms

 The replication apply delay -1945478837 ms part looks strange
 because the delay is below 0. The number is formatted as %d in elog
 call, and I suspect this is kind of integer overflow.

 
 Looking at GetReplicationApplyDelay() in walreceiverfuncs.c I noticed that
 the integer overflow occurs because sometimes the return of the
 GetCurrentChunkReplayStartTime() is 0 (zero).
 
 I added an elog into GetReplicationApplyDelay to check this info:
 
 DEBUG:  GetReplicationApplyDelay 479099832 seconds, 352 milliseconds,
 (0.00, 479099832352083.00)
 DEBUG:  sendtime 2015-03-08 00:17:12.351987-03 receipttime 2015-03-08
 00:17:12.352043-03 replication apply delay -1936504800 ms transfer latency
 0 ms
 DEBUG:  GetReplicationApplyDelay 479099841 seconds, 450 milliseconds,
 (0.00, 479099841450320.00)
 DEBUG:  sendtime 2015-03-08 00:17:21.45018-03 receipttime 2015-03-08
 00:17:21.450294-03 replication apply delay -1936495702 ms transfer latency
 0 ms

Right. Until walreceiver gets the first WAL record to replay,
xlogctl-currentChunkStartTime remains 0.

 Maybe we should check before and return zero from GetReplicationApplyDelay.
 The attached patch implement it.

Your patch regards 0 replication apply delay in the case above which
is confusing if the delay is actually 0.

What about something like this to explicitly stats the delay data is
not available?

elog(DEBUG2, sendtime %s receipttime %s replication apply delay (N/A) transfer 
latency %d ms,

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Strange debug message of walreciver?

2015-03-08 Thread Tatsuo Ishii
 On Sun, Mar 8, 2015 at 8:13 PM, Tatsuo Ishii is...@postgresql.org wrote:

  On Sat, Mar 7, 2015 at 10:18 PM, Tatsuo Ishii is...@postgresql.org
 wrote:
 
  When I set log_min_messages to debug5 and looked into walreciver log,
  I saw this:
 
  3600 2015-03-08 09:47:38 JST DEBUG:  sendtime 2015-03-08
  09:47:38.31493+09 receipttime 2015-03-08 09:47:38.315027+09 replication
  apply delay -1945478837 ms tran sfer latency 0 ms
 
  The replication apply delay -1945478837 ms part looks strange
  because the delay is below 0. The number is formatted as %d in elog
  call, and I suspect this is kind of integer overflow.
 
 
  Looking at GetReplicationApplyDelay() in walreceiverfuncs.c I noticed
 that
  the integer overflow occurs because sometimes the return of the
  GetCurrentChunkReplayStartTime() is 0 (zero).
 
  I added an elog into GetReplicationApplyDelay to check this info:
 
  DEBUG:  GetReplicationApplyDelay 479099832 seconds, 352 milliseconds,
  (0.00, 479099832352083.00)
  DEBUG:  sendtime 2015-03-08 00:17:12.351987-03 receipttime 2015-03-08
  00:17:12.352043-03 replication apply delay -1936504800 ms transfer
 latency
  0 ms
  DEBUG:  GetReplicationApplyDelay 479099841 seconds, 450 milliseconds,
  (0.00, 479099841450320.00)
  DEBUG:  sendtime 2015-03-08 00:17:21.45018-03 receipttime 2015-03-08
  00:17:21.450294-03 replication apply delay -1936495702 ms transfer
 latency
  0 ms

 Right. Until walreceiver gets the first WAL record to replay,
 xlogctl-currentChunkStartTime remains 0.

  Maybe we should check before and return zero from
 GetReplicationApplyDelay.
  The attached patch implement it.

 Your patch regards 0 replication apply delay in the case above which
 is confusing if the delay is actually 0.

 What about something like this to explicitly stats the delay data is
 not available?

 elog(DEBUG2, sendtime %s receipttime %s replication apply delay (N/A)
 transfer latency %d ms,

 
 Makes sense. Attached patch implement what you suggested.

Looks good. If there's no objection, I will commit this.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Strange debug message of walreciver?

2015-03-08 Thread Fabrízio de Royes Mello
On Sun, Mar 8, 2015 at 8:13 PM, Tatsuo Ishii is...@postgresql.org wrote:

  On Sat, Mar 7, 2015 at 10:18 PM, Tatsuo Ishii is...@postgresql.org
wrote:
 
  When I set log_min_messages to debug5 and looked into walreciver log,
  I saw this:
 
  3600 2015-03-08 09:47:38 JST DEBUG:  sendtime 2015-03-08
  09:47:38.31493+09 receipttime 2015-03-08 09:47:38.315027+09 replication
  apply delay -1945478837 ms tran sfer latency 0 ms
 
  The replication apply delay -1945478837 ms part looks strange
  because the delay is below 0. The number is formatted as %d in elog
  call, and I suspect this is kind of integer overflow.
 
 
  Looking at GetReplicationApplyDelay() in walreceiverfuncs.c I noticed
that
  the integer overflow occurs because sometimes the return of the
  GetCurrentChunkReplayStartTime() is 0 (zero).
 
  I added an elog into GetReplicationApplyDelay to check this info:
 
  DEBUG:  GetReplicationApplyDelay 479099832 seconds, 352 milliseconds,
  (0.00, 479099832352083.00)
  DEBUG:  sendtime 2015-03-08 00:17:12.351987-03 receipttime 2015-03-08
  00:17:12.352043-03 replication apply delay -1936504800 ms transfer
latency
  0 ms
  DEBUG:  GetReplicationApplyDelay 479099841 seconds, 450 milliseconds,
  (0.00, 479099841450320.00)
  DEBUG:  sendtime 2015-03-08 00:17:21.45018-03 receipttime 2015-03-08
  00:17:21.450294-03 replication apply delay -1936495702 ms transfer
latency
  0 ms

 Right. Until walreceiver gets the first WAL record to replay,
 xlogctl-currentChunkStartTime remains 0.

  Maybe we should check before and return zero from
GetReplicationApplyDelay.
  The attached patch implement it.

 Your patch regards 0 replication apply delay in the case above which
 is confusing if the delay is actually 0.

 What about something like this to explicitly stats the delay data is
 not available?

 elog(DEBUG2, sendtime %s receipttime %s replication apply delay (N/A)
transfer latency %d ms,


Makes sense. Attached patch implement what you suggested.

Regards,

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
 Timbira: http://www.timbira.com.br
 Blog: http://fabriziomello.github.io
 Linkedin: http://br.linkedin.com/in/fabriziomello
 Twitter: http://twitter.com/fabriziomello
 Github: http://github.com/fabriziomello
diff --git a/src/backend/replication/walreceiver.c b/src/backend/replication/walreceiver.c
index bfbc02f..9c7710f 100644
--- a/src/backend/replication/walreceiver.c
+++ b/src/backend/replication/walreceiver.c
@@ -1191,15 +1191,26 @@ ProcessWalSndrMessage(XLogRecPtr walEnd, TimestampTz sendTime)
 	{
 		char	   *sendtime;
 		char	   *receipttime;
+		int			applyDelay;
 
 		/* Copy because timestamptz_to_str returns a static buffer */
 		sendtime = pstrdup(timestamptz_to_str(sendTime));
 		receipttime = pstrdup(timestamptz_to_str(lastMsgReceiptTime));
-		elog(DEBUG2, sendtime %s receipttime %s replication apply delay %d ms transfer latency %d ms,
-			 sendtime,
-			 receipttime,
-			 GetReplicationApplyDelay(),
-			 GetReplicationTransferLatency());
+		applyDelay = GetReplicationApplyDelay();
+
+		/* apply delay is not available */
+		if (applyDelay == -1)
+			elog(DEBUG2, sendtime %s receipttime %s replication apply delay (N/A) transfer latency %d ms,
+ sendtime,
+ receipttime,
+ GetReplicationTransferLatency());
+		else
+			elog(DEBUG2, sendtime %s receipttime %s replication apply delay %d ms transfer latency %d ms,
+ sendtime,
+ receipttime,
+ applyDelay,
+ GetReplicationTransferLatency());
+
 		pfree(sendtime);
 		pfree(receipttime);
 	}
diff --git a/src/backend/replication/walreceiverfuncs.c b/src/backend/replication/walreceiverfuncs.c
index 496605f..b26f5fc 100644
--- a/src/backend/replication/walreceiverfuncs.c
+++ b/src/backend/replication/walreceiverfuncs.c
@@ -314,7 +314,8 @@ GetWalRcvWriteRecPtr(XLogRecPtr *latestChunkStart, TimeLineID *receiveTLI)
 }
 
 /*
- * Returns the replication apply delay in ms
+ * Returns the replication apply delay in ms or -1
+ * if the apply delay info is not available
  */
 int
 GetReplicationApplyDelay(void)
@@ -328,6 +329,8 @@ GetReplicationApplyDelay(void)
 	long		secs;
 	int			usecs;
 
+	TimestampTz	chunckReplayStartTime;
+
 	SpinLockAcquire(walrcv-mutex);
 	receivePtr = walrcv-receivedUpto;
 	SpinLockRelease(walrcv-mutex);
@@ -337,7 +340,12 @@ GetReplicationApplyDelay(void)
 	if (receivePtr == replayPtr)
 		return 0;
 
-	TimestampDifference(GetCurrentChunkReplayStartTime(),
+	chunckReplayStartTime = GetCurrentChunkReplayStartTime();
+
+	if (chunckReplayStartTime == 0)
+		return -1;
+
+	TimestampDifference(chunckReplayStartTime,
 		GetCurrentTimestamp(),
 		secs, usecs);
 

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Strange debug message of walreciver?

2015-03-07 Thread Tatsuo Ishii
When I set log_min_messages to debug5 and looked into walreciver log,
I saw this:

3600 2015-03-08 09:47:38 JST DEBUG:  sendtime 2015-03-08 09:47:38.31493+09 
receipttime 2015-03-08 09:47:38.315027+09 replication apply delay -1945478837 
ms tran sfer latency 0 ms

The replication apply delay -1945478837 ms part looks strange
because the delay is below 0. The number is formatted as %d in elog
call, and I suspect this is kind of integer overflow.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Strange debug message of walreciver?

2015-03-07 Thread Fabrízio de Royes Mello
On Sat, Mar 7, 2015 at 10:18 PM, Tatsuo Ishii is...@postgresql.org wrote:

 When I set log_min_messages to debug5 and looked into walreciver log,
 I saw this:

 3600 2015-03-08 09:47:38 JST DEBUG:  sendtime 2015-03-08
09:47:38.31493+09 receipttime 2015-03-08 09:47:38.315027+09 replication
apply delay -1945478837 ms tran sfer latency 0 ms

 The replication apply delay -1945478837 ms part looks strange
 because the delay is below 0. The number is formatted as %d in elog
 call, and I suspect this is kind of integer overflow.


Looking at GetReplicationApplyDelay() in walreceiverfuncs.c I noticed that
the integer overflow occurs because sometimes the return of the
GetCurrentChunkReplayStartTime() is 0 (zero).

I added an elog into GetReplicationApplyDelay to check this info:

DEBUG:  GetReplicationApplyDelay 479099832 seconds, 352 milliseconds,
(0.00, 479099832352083.00)
DEBUG:  sendtime 2015-03-08 00:17:12.351987-03 receipttime 2015-03-08
00:17:12.352043-03 replication apply delay -1936504800 ms transfer latency
0 ms
DEBUG:  GetReplicationApplyDelay 479099841 seconds, 450 milliseconds,
(0.00, 479099841450320.00)
DEBUG:  sendtime 2015-03-08 00:17:21.45018-03 receipttime 2015-03-08
00:17:21.450294-03 replication apply delay -1936495702 ms transfer latency
0 ms

Maybe we should check before and return zero from GetReplicationApplyDelay.
The attached patch implement it.

Regards,

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
 Timbira: http://www.timbira.com.br
 Blog: http://fabriziomello.github.io
 Linkedin: http://br.linkedin.com/in/fabriziomello
 Twitter: http://twitter.com/fabriziomello
 Github: http://github.com/fabriziomello
diff --git a/src/backend/replication/walreceiverfuncs.c b/src/backend/replication/walreceiverfuncs.c
index 496605f..0c45b66 100644
--- a/src/backend/replication/walreceiverfuncs.c
+++ b/src/backend/replication/walreceiverfuncs.c
@@ -328,6 +328,8 @@ GetReplicationApplyDelay(void)
 	long		secs;
 	int			usecs;
 
+	TimestampTz	chunckReplayStartTime;
+
 	SpinLockAcquire(walrcv-mutex);
 	receivePtr = walrcv-receivedUpto;
 	SpinLockRelease(walrcv-mutex);
@@ -337,7 +339,12 @@ GetReplicationApplyDelay(void)
 	if (receivePtr == replayPtr)
 		return 0;
 
-	TimestampDifference(GetCurrentChunkReplayStartTime(),
+	chunckReplayStartTime = GetCurrentChunkReplayStartTime();
+
+	if (chunckReplayStartTime == 0)
+		return 0;
+
+	TimestampDifference(chunckReplayStartTime,
 		GetCurrentTimestamp(),
 		secs, usecs);
 

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers