On Mon, Nov 8, 2021 at 8:20 AM Amul Sul <sula...@gmail.com> wrote: > Attached is the rebased version of refactoring as well as the > pg_prohibit_wal feature patches for the latest master head (commit # > 39a3105678a).
I spent a lot of time today studying 0002, and specifically the question of whether EndOfLog must be the same as XLogCtl->replayEndRecPtr and whether EndOfLogTLI must be the same as XLogCtl->replayEndTLI. The answer to the former question is "no" because, if we don't enter redo, XLogCtl->replayEndRecPtr won't be initialized at all. If we do enter redo, then I think it has to be the same unless something very weird happens. EndOfLog gets set like this: XLogBeginRead(xlogreader, LastRec); record = ReadRecord(xlogreader, PANIC, false, replayTLI); EndOfLog = EndRecPtr; In every case that exists in our regression tests, EndRecPtr is the same before these three lines of code as it is afterward. However, if you test with recovery_target=immediate, you can get it to be different, because in that case we drop out of the redo loop after calling recoveryStopsBefore() rather than after calling recoveryStopsAfter(). Similarly I'm fairly sure that if you use recovery_target_inclusive=off you can likewise get it to be different (though I discovered the hard way that recovery_target_inclusive=off is ignored when you use recovery_target_name). It seems like a really bad thing that neither recovery_target=immediate nor recovery_target_inclusive=off have any tests, and I think we ought to add some. Anyway, in effect, these three lines of code have the effect of backing up the xlogreader by one record when we stop before rather than after a record that we're replaying. What that means is that EndOfLog is going to be the end+1 of the last record that we actually replayed. There might be one more record that we read but did not replay, and that record won't impact the value we end up with in EndOfLog. Now, XLogCtl->replayEndRecPtr is also that end+1 of the last record that we actually replayed. To put that another way, there's no way to exit the main redo loop after we set XLogCtl->replayEndRecPtr and before we change LastRec. So in the cases where XLogCtl->replayEndRecPtr gets initialized at all, it can only be different from EndOfLog if something different happens when we re-read the last-replayed WAL record than what happened when we read it the first time. That seems unlikely, and would be unfortunate it if it did happen. I am inclined to think that it might be better not to reread the record at all, though. As far as this patch goes, I think we need a solution that doesn't involve fetching EndOfLog from a variable that's only sometimes initialized and then not doing anything with it except in the cases where it was initialized. As for EndOfLogTLI, I'm afraid I don't think that's the same thing as XLogCtl->replayEndTLI. Now, it's hard to be sure, because I don't think the regression tests contain any scenarios where we run recovery and the values end up different. However, I think that the code sets EndOfLogTLI to the TLI of the last WAL file that we read, and I think XLogCtl->replayEndTLI gets set to the timeline from which that WAL record originated. So imagine that we are looking for WAL that ought to be in 000000010000000000000003 but we don't find it; instead we find 000000020000000000000003 because our recovery target timeline is 2, or something that has 2 in its history. We will read the WAL for timeline 1 from this file which has timeline 2 in the file name. I think if recovery ends in this file before the timeline switch, these values will be different. I did not try to construct a test case for this today due to not having enough time, so it's possible that I'm wrong about this, but that's how it looks to me from the code. -- Robert Haas EDB: http://www.enterprisedb.com