On Sun, Mar 8, 2015 at 8:13 PM, Tatsuo Ishii <[email protected]> wrote:
>
> > On Sat, Mar 7, 2015 at 10:18 PM, Tatsuo Ishii <[email protected]>
wrote:
> >>
> >> When I set log_min_messages to debug5 and looked into walreciver log,
> >> I saw this:
> >>
> >> 3600 2015-03-08 09:47:38 JST DEBUG: sendtime 2015-03-08
> > 09:47:38.31493+09 receipttime 2015-03-08 09:47:38.315027+09 replication
> > apply delay -1945478837 ms tran sfer latency 0 ms
> >>
> >> The "replication apply delay -1945478837 ms" part looks strange
> >> because the delay is below 0. The number is formatted as %d in elog
> >> call, and I suspect this is kind of integer overflow.
> >>
> >
> > Looking at GetReplicationApplyDelay() in walreceiverfuncs.c I noticed
that
> > the integer overflow occurs because sometimes the return of the
> > GetCurrentChunkReplayStartTime() is 0 (zero).
> >
> > I added an elog into GetReplicationApplyDelay to check this info:
> >
> > DEBUG: GetReplicationApplyDelay 479099832 seconds, 352 milliseconds,
> > (0.000000, 479099832352083.000000)
> > DEBUG: sendtime 2015-03-08 00:17:12.351987-03 receipttime 2015-03-08
> > 00:17:12.352043-03 replication apply delay -1936504800 ms transfer
latency
> > 0 ms
> > DEBUG: GetReplicationApplyDelay 479099841 seconds, 450 milliseconds,
> > (0.000000, 479099841450320.000000)
> > DEBUG: sendtime 2015-03-08 00:17:21.45018-03 receipttime 2015-03-08
> > 00:17:21.450294-03 replication apply delay -1936495702 ms transfer
latency
> > 0 ms
>
> Right. Until walreceiver gets the first WAL record to replay,
> xlogctl->currentChunkStartTime remains 0.
>
> > Maybe we should check before and return zero from
GetReplicationApplyDelay.
> > The attached patch implement it.
>
> Your patch regards 0 replication apply delay in the case above which
> is confusing if the delay is actually 0.
>
> What about something like this to explicitly stats the delay data is
> not available?
>
> elog(DEBUG2, "sendtime %s receipttime %s replication apply delay (N/A)
transfer latency %d ms",
>
Makes sense. Attached patch implement what you suggested.
Regards,
--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
>> Timbira: http://www.timbira.com.br
>> Blog: http://fabriziomello.github.io
>> Linkedin: http://br.linkedin.com/in/fabriziomello
>> Twitter: http://twitter.com/fabriziomello
>> Github: http://github.com/fabriziomello
diff --git a/src/backend/replication/walreceiver.c b/src/backend/replication/walreceiver.c
index bfbc02f..9c7710f 100644
--- a/src/backend/replication/walreceiver.c
+++ b/src/backend/replication/walreceiver.c
@@ -1191,15 +1191,26 @@ ProcessWalSndrMessage(XLogRecPtr walEnd, TimestampTz sendTime)
{
char *sendtime;
char *receipttime;
+ int applyDelay;
/* Copy because timestamptz_to_str returns a static buffer */
sendtime = pstrdup(timestamptz_to_str(sendTime));
receipttime = pstrdup(timestamptz_to_str(lastMsgReceiptTime));
- elog(DEBUG2, "sendtime %s receipttime %s replication apply delay %d ms transfer latency %d ms",
- sendtime,
- receipttime,
- GetReplicationApplyDelay(),
- GetReplicationTransferLatency());
+ applyDelay = GetReplicationApplyDelay();
+
+ /* apply delay is not available */
+ if (applyDelay == -1)
+ elog(DEBUG2, "sendtime %s receipttime %s replication apply delay (N/A) transfer latency %d ms",
+ sendtime,
+ receipttime,
+ GetReplicationTransferLatency());
+ else
+ elog(DEBUG2, "sendtime %s receipttime %s replication apply delay %d ms transfer latency %d ms",
+ sendtime,
+ receipttime,
+ applyDelay,
+ GetReplicationTransferLatency());
+
pfree(sendtime);
pfree(receipttime);
}
diff --git a/src/backend/replication/walreceiverfuncs.c b/src/backend/replication/walreceiverfuncs.c
index 496605f..b26f5fc 100644
--- a/src/backend/replication/walreceiverfuncs.c
+++ b/src/backend/replication/walreceiverfuncs.c
@@ -314,7 +314,8 @@ GetWalRcvWriteRecPtr(XLogRecPtr *latestChunkStart, TimeLineID *receiveTLI)
}
/*
- * Returns the replication apply delay in ms
+ * Returns the replication apply delay in ms or -1
+ * if the apply delay info is not available
*/
int
GetReplicationApplyDelay(void)
@@ -328,6 +329,8 @@ GetReplicationApplyDelay(void)
long secs;
int usecs;
+ TimestampTz chunckReplayStartTime;
+
SpinLockAcquire(&walrcv->mutex);
receivePtr = walrcv->receivedUpto;
SpinLockRelease(&walrcv->mutex);
@@ -337,7 +340,12 @@ GetReplicationApplyDelay(void)
if (receivePtr == replayPtr)
return 0;
- TimestampDifference(GetCurrentChunkReplayStartTime(),
+ chunckReplayStartTime = GetCurrentChunkReplayStartTime();
+
+ if (chunckReplayStartTime == 0)
+ return -1;
+
+ TimestampDifference(chunckReplayStartTime,
GetCurrentTimestamp(),
&secs, &usecs);
--
Sent via pgsql-hackers mailing list ([email protected])
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers