At Wed, 3 Mar 2021 14:56:25 -0800, Soumyadeep Chakraborty 
<soumyadeep2...@gmail.com> wrote in 
> On 2021/03/03 17:46, Heikki Linnakangas wrote:
> 
> > I think it should be reset even earlier, inside XlogReadTwoPhaseData()
> > probably. With your patch, doesn't the LogStandbySnapshot() call just
> > above where you're ressetting ThisTimeLineID also write a WAL record
> > with incorrect timeline?
> 
> Agreed.

Right.

> On Wed, Mar 3, 2021 at 1:04 AM Fujii Masao <masao.fu...@oss.nttdata.com> 
> wrote:
> 
> > > Even better, can we avoid setting ThisTimeLineID in 
> > > XlogReadTwoPhaseData() in the first place?
> >
> >
> >
> > Or isn't it better to reset ThisTimeLineID in read_local_xlog_page(), i.e.,
> > prevent read_local_xlog_page() from changing ThisTimeLineID? I'm not
> > sure if that's possible, though.. In the future other functions that calls
> > read_local_xlog_page() during the promotion may appear. Fixing the issue
> > outside read_local_xlog_page() may cause those functions to get
> > the same issue.
> 
> I agree. We should fix the issue in read_local_xlog_page(). I have
> attached two different patches which do so:
> saved_ThisTimeLineID.patch and pass_ThisTimeLineID.patch.
> 
> The former saves the value of the ThisTimeLineID before it gets changed
> in read_local_xlog_page() and resets it after ThisTimeLineID has been
> used later on in the code (by XLogReadDetermineTimeline()).
> 
> The latter removes occurrences of ThisTimeLineID from
> XLogReadDetermineTimeline() and introduces an argument currTLI to
> XLogReadDetermineTimeline() to be used in its stead.

read_local_xlog_page() works as a part of logical decoding and has
responsibility to update ThisTimeLineID properly.  As the comment in
the function, it is the proper place to update ThisTimeLineID since we
miss a timeline change if we check it earlier and the function uses
the value just after. So we cannot change that behavior of the
function.  That is, neither of them doesn't seem to be the right fix.

The confusion here is that there's two ThisTimeLineID's here. The
previous TLI for read and the next TLI to write.  Most part of the
function is needed to read a 2pc recrod so the ways we can take here
is:

1. Somehow tell the function not to update ThisTimeLineID in specific
  cases. This can be done by xlogreader private data but this doesn't
  seem reasonable.

2. Restore ThisTimeLineID after calling XLogReadRecord() in the
  *caller* side.  This is what came up to me first but I don't like
  this, too, but I don't find better fix.  way.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
diff --git a/src/backend/access/transam/twophase.c b/src/backend/access/transam/twophase.c
index 80d2d20d6c..ab9ac0cd5a 100644
--- a/src/backend/access/transam/twophase.c
+++ b/src/backend/access/transam/twophase.c
@@ -1324,6 +1324,7 @@ XlogReadTwoPhaseData(XLogRecPtr lsn, char **buf, int *len)
 	XLogRecord *record;
 	XLogReaderState *xlogreader;
 	char	   *errormsg;
+	TimeLineID	currtli = ThisTimeLineID;
 
 	xlogreader = XLogReaderAllocate(wal_segment_size, NULL,
 									XL_ROUTINE(.page_read = &read_local_xlog_page,
@@ -1358,6 +1359,12 @@ XlogReadTwoPhaseData(XLogRecPtr lsn, char **buf, int *len)
 	memcpy(*buf, XLogRecGetData(xlogreader), sizeof(char) * XLogRecGetDataLen(xlogreader));
 
 	XLogReaderFree(xlogreader);
+
+	/*
+	 * This function reads a record not as a part of replication so we don't
+	 * want to let XLogReadRecord to change ThisTimeLineID.
+	 */
+	ThisTimeLineID = currtli;
 }
 
 

Reply via email to