Hello Michael,

On Wed, 24 Jul 2019 09:49:05 +0900
Michael Paquier <mich...@paquier.xyz> wrote:

> On Tue, Jul 23, 2019 at 06:05:18PM +0200, Jehan-Guillaume de Rorthais wrote:
> > Please, find in attachment a first trivial patch to support
> > pg_walfile_name() and pg_walfile_name_offset() on a standby.
> > Previous restriction on this functions seems related to ThisTimeLineID not
> > being safe on standby. This patch is fetching the timeline from
> > WalRcv->receivedTLI using GetWalRcvWriteRecPtr(). As far as I understand,
> > this is updated each time some data are flushed to the WAL.  
[...]
> Your patch does not count for the case of archive recovery, where
> there is no WAL receiver, and as the shared memory state of the WAL
> receiver is not set 0 would be set.

Indeed. I tested this topic with the following query and was fine with the
NULL result:

  select pg_walfile_name(pg_last_wal_receive_lsn());

I was fine with this result because my use case requires replication anyway. A
NULL result would mean that the node never streamed from the old primary since
its last startup, so a failover should ignore it anyway.

However, NULL just comes from pg_last_wal_receive_lsn() here. The following
query result is wrong:

  > select pg_walfile_name('0/1')
  000000000000000000000000

I fixed that. See patch 0001-v2-* in attachement


> The replay timeline is something we could use here instead via
> GetXLogReplayRecPtr(). CreateRestartPoint actually takes the latest WAL
> receiver or replayed point for its end LSN position, whichever is newer.

I did consider GetXLogReplayRecPtr() or even XLogCtl->replayEndTLI (which is
updated right before the replay). However, both depend on read activity on the
standby. That's why I picked WalRcv->receivedTLI which is updated whatever the
reading activity on the standby.

> > Last, I plan to produce an extension to support this on older release. Is
> > it something that could be integrated in official source tree during a minor
> > release or should I publish it on eg. pgxn?  
> 
> Unfortunately no.  This is a behavior change so it cannot find its way
> into back branches.

Yes, my patch is a behavior change. But here, I was yalking about an
extension, not the core itself, to support this feature in older releases.

> The WAL receiver state is in shared memory and published, so that's easy
> enough to get.  We don't do that for XLogCtl unfortunately.

Both are in shared memory, but WalRcv have a public function to get its
receivedTLI member.

XLogCtl has nothing in public to expose its ThisTimeLineID member. However, from
a module, I'm able to fetch it using:

  XLogCtl = ShmemInitStruct("XLOG Ctl", XLOGShmemSize(), &found);
  SpinLockAcquire(&XLogCtl->info_lck);
  tl = XLogCtl->ThisTimeLineID;
  SpinLockRelease(&XLogCtl->info_lck);

As the "XLOG Ctl" index entry already exists in shmem, ShmemInitStruct returns
the correct structure from there. Not sure this was supposed to be used this
way though...Adding a public function might be cleaner, but it will not help
for older releases.

> I think that there are arguments for being more flexible with it, and perhaps
> have a system-level view to be able to look at some of its fields.

Great idea. I'll give it a try to keep the discussion on.

> There is also a downside with get_controlfile(), which is that it
> fetches directly the data from the on-disk pg_control, and
> post-recovery this only gets updated at the first checkpoint.

Indeed, that's why I started this patch and thread.

Thanks,
>From fdf133645b8cc2728cca3677e71bdd5cb69cdbd4 Mon Sep 17 00:00:00 2001
From: Jehan-Guillaume de Rorthais <j...@dalibo.com>
Date: Tue, 23 Jul 2019 17:28:44 +0200
Subject: [PATCH] Support pg_walfile_name on standby

Support executing both SQL functions pg_walfile_name() and
pg_walfile_name_offset() on a standby.
---
 src/backend/access/transam/xlogfuncs.c | 39 +++++++++++++++++---------
 1 file changed, 25 insertions(+), 14 deletions(-)

diff --git a/src/backend/access/transam/xlogfuncs.c b/src/backend/access/transam/xlogfuncs.c
index b35043bf71..86c4d8382b 100644
--- a/src/backend/access/transam/xlogfuncs.c
+++ b/src/backend/access/transam/xlogfuncs.c
@@ -460,13 +460,7 @@ pg_walfile_name_offset(PG_FUNCTION_ARGS)
 	TupleDesc	resultTupleDesc;
 	HeapTuple	resultHeapTuple;
 	Datum		result;
-
-	if (RecoveryInProgress())
-		ereport(ERROR,
-				(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
-				 errmsg("recovery is in progress"),
-				 errhint("%s cannot be executed during recovery.",
-						 "pg_walfile_name_offset()")));
+	TimeLineID  tl;
 
 	/*
 	 * Construct a tuple descriptor for the result row.  This must match this
@@ -480,11 +474,24 @@ pg_walfile_name_offset(PG_FUNCTION_ARGS)
 
 	resultTupleDesc = BlessTupleDesc(resultTupleDesc);
 
+	if (RecoveryInProgress())
+	{
+		GetWalRcvWriteRecPtr(NULL, &tl);
+
+		if (!tl)
+		{
+			isnull[0] = isnull[1] = true;
+			goto result;
+		}
+	}
+	else
+		tl = ThisTimeLineID;
+
 	/*
 	 * xlogfilename
 	 */
 	XLByteToPrevSeg(locationpoint, xlogsegno, wal_segment_size);
-	XLogFileName(xlogfilename, ThisTimeLineID, xlogsegno, wal_segment_size);
+	XLogFileName(xlogfilename, tl, xlogsegno, wal_segment_size);
 
 	values[0] = CStringGetTextDatum(xlogfilename);
 	isnull[0] = false;
@@ -497,6 +504,7 @@ pg_walfile_name_offset(PG_FUNCTION_ARGS)
 	values[1] = UInt32GetDatum(xrecoff);
 	isnull[1] = false;
 
+result:
 	/*
 	 * Tuple jam: Having first prepared your Datums, then squash together
 	 */
@@ -517,16 +525,19 @@ pg_walfile_name(PG_FUNCTION_ARGS)
 	XLogSegNo	xlogsegno;
 	XLogRecPtr	locationpoint = PG_GETARG_LSN(0);
 	char		xlogfilename[MAXFNAMELEN];
+	TimeLineID  tl;
 
 	if (RecoveryInProgress())
-		ereport(ERROR,
-				(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
-				 errmsg("recovery is in progress"),
-				 errhint("%s cannot be executed during recovery.",
-						 "pg_walfile_name()")));
+	{
+		GetWalRcvWriteRecPtr(NULL, &tl);
+		if (!tl)
+			PG_RETURN_NULL();
+	}
+	else
+		tl = ThisTimeLineID;
 
 	XLByteToPrevSeg(locationpoint, xlogsegno, wal_segment_size);
-	XLogFileName(xlogfilename, ThisTimeLineID, xlogsegno, wal_segment_size);
+	XLogFileName(xlogfilename, tl, xlogsegno, wal_segment_size);
 
 	PG_RETURN_TEXT_P(cstring_to_text(xlogfilename));
 }
-- 
2.20.1

Reply via email to