On Thu, Nov 18, 2021 at 3:31 AM Robert Haas <robertmh...@gmail.com> wrote: > > I spent a lot of time trying to figure out why xlog.c has global > variables ReadRecPtr and EndRecPtr instead of just relying on the > eponymous structure members inside the XLogReaderState. I concluded > that the values are the same at most points in the code, and thus that > we could just use xlogreaderstate->{Read,End}RecPtr instead. There are > two places where this wouldn't produce the same results we're getting > today. Both of those appear to be bugs. > > The reason why it's generally the case that ReadRecPtr == > xlogreaderstate->ReadRecPtr and likewise for EndRecPtr is that > ReadRecord() calls XLogReadRecord(), which sets the values inside the > XLogReaderState, and then immediately assigns the values stored there > to the global variables. There's no other code that changes the other > global variables, and the only other code that changes the structure > members is XLogBeginRead(). So the values can be unequal from the time > XLogBeginRead() is called up until the time that the XLogReadRecord() > call inside ReadRecord() returns. In practice, StartupXLOG() always > calls ReadRecord() right after it calls XLogBeginRead(), and > ReadRecord() does not reference either global variable before calling > XLogReadRecord(), so the problem surface is limited to code that runs > underneath XLogReadRecord(). XLogReadRecord() is part of xlogreader.c, > but it uses a callback interface: the callback is XLogPageRead(), > which itself references EndRecPtr, and also calls > WaitForWALToBecomeAvailable(), which in turn calls > rescanLatestTimeLine(), which also references EndRecPtr. So these are > the two problem cases: XLogPageRead(), and rescanLatestTimeLine(). > > In rescanLatestTimeLine(), the problem is IMHO probably serious enough > to justify a separate commit with back-patching. The problem is that > EndRecPtr is being used here to reject impermissible attempts to > switch to a bad timeline, but if pg_wal starts out empty, EndRecPtr > will be 0 here, which causes the code to fail to detect a case that > should be prohibited. Consider the following test case: > > - create a primary > - create standby #1 from the primary > - start standby #1 and promote it > - take a backup from the primary using -Xnone to create standby #2 > - clear primary_conninfo on standby #2 and then start it > - copy 00000002.history from standby #1 to standby #2 > > You get: > > 2021-11-17 15:34:26.213 EST [7474] LOG: selected new timeline ID: 2 > > But with the attached patch, you get: > > 2021-11-17 16:12:01.566 EST [20900] LOG: new timeline 2 forked off > current database system timeline 1 before current recovery point > 0/A000060 >
Somehow with and without patch I am getting the same log. > Had the problem occurred at some later point in the WAL stream rather > than before fetching the very first record, I think everything is > fine; at that point, I think that the global variable EndRecPtr will > be initialized. I'm not entirely sure that it contains exactly the > right value, but it's someplace in the right ballpark, at least. > Agree, change seems pretty much reasonable. > In XLogPageRead(), the problem is just cosmetic. We're only using > EndRecPtr as an argument to emode_for_corrupt_record(), which is all > about suppressing duplicate complaints about the same LSN. But if the > xlogreader has been repositioned using XLogBeginRead() since the last > call to ReadRecord(), or if there are no preceding calls to > ReadRecord(), then the value of EndRecPtr here is left over from the > previous read position and is not particularly related to the record > we're reading now. xlogreader->EndRecPtr, OTOH, is. This doesn't seem > worth a separate commit to me, or a back-patch, but it seems worth > fixing while I'm cleaning up these global variables. > LGTM. Regards, Amul