On Mon, Nov 1, 2021 at 11:33 PM Michael Paquier <mich...@paquier.xyz> wrote: > + /* > + * If we're writing and flushing WAL, the time line can't be changing, > + * so no lock is required. > + */ > + if (insertTLI) > + *insertTLI = XLogCtl->ThisTimeLineID; > In 0002, there is no downside in putting that in the spinlock section > either, no? It seems to me that we should be able to call this one > even in recovery, as flush LSNs exist while in recovery.
I don't think it matters very much from a performance point of view, although putting a branch inside of a spinlock-protected section may be undesirable. My bigger issues with this kind of idea is that we don't want to use spinlocks as a "security blanket" (as in Linus from Peanuts). Every time we use a spinlock, or any other kind of locking mechanism, we should have a clear idea what we're protecting ourselves against. I'm quite suspicious that there are a number of places where we're taking spinlocks in xlog.c just to feel virtuous, and not because it does anything useful, and I don't particularly want to increase the number of such places. Also, XLogCtl->ThisTimeLineID isn't set until the end of recovery, so calling this function during recovery would be a mistake. There seem to be a number of interface functions in xlog.c that should only be called when not in recovery, and neither their names nor their comments make that clear. I wasn't bold enough to go label all the ones I think fall into that category, but maybe we should. Honestly, the naming of things in this file is annoyingly bad in general. My favorite example, found during this project, is: #define ConvertToXSegs(x, segsize) XLogMBVarToSegs((x), (segsize)) It's not good enough to have one name that's difficult to understand or remember or capitalize properly -- let's have TWO -- for the same thing! > In 0007, XLogFileClose() should reset openLogTLI. The same can be > said in BootStrapXLOG() before creating the control file. It may be > cleaner here to introduce an InvalidTimelineId, as discussed a couple > of days ago. I thought about that, but I think it's pointless. I think we can just say that if openLogFile == -1, openLogSegNo and openLogTLI are meaningless. I don't think we're currently resetting openLogSegNo to an invalid value either, so I don't think openLogTLI should be treated differently. I'm not opposed to introducing InvalidTimeLineID on principle, and I looked for a way of doing it in this patch set, but it didn't seem all that valuable, and I feel that someone who cares about it can do it as a separate effort after I commit these. Honestly, I think these patches greatly reduce the need to be afraid of leaving the TLI unset, because they remove the crutch of hoping that ThisTimeLineID is initialized to the proper value. You actually have to think about which TLI you are going to pass. Now I'm very sure further improvement is possible, but I think this patch set is complex enough already, and I'd like to get it committed before contemplating any other work in this area. > Some comments at the top of recoveryTargetTLIRequested need a refresh > with their references to ThisTimelineID. After thinking this over, I think we should just delete the entire first paragraph, so that the comments start like this: /* * recoveryTargetTimeLineGoal: what the user requested, if any * * recoveryTargetTLIRequested: numeric value of requested timeline, if constant I don't see that we'd be losing anything that way. When you look at that paragraph now, the first sentence is just confusing the issue, because we actually don't care about XLogCtl->ThisTimeLineID during recovery, because it's not set. We do care about plain old ThisTimeLineID, but that's going to be demoted to a local variable by these patches, and there just doesn't seem to be any reason to discuss it here. The second sentence is wrong already, because the rmgr routines do not rely on ThisTimeLineID. And the third sentence is also wrong and/or confusing, because not all of the variables that follow are TLIs -- only 3 of 5 are. I don't actually think there's any useful introduction we need to provide here; we just need to tell people what the variables we're about to declare do. > The handling of XLogCtl->ThisTimeLineID still seems a bit blurry when > it comes to timeline switches because we don't update it while in > recovery when replaying a end-of-recovery record. This could at least > be documented better. We don't update it during recovery at all. There's exactly one assignment statement for that variable and it's here: /* Save the selected TimeLineID in shared memory, too */ XLogCtl->ThisTimeLineID = ThisTimeLineID; XLogCtl->PrevTimeLineID = PrevTimeLineID; That's after the main redo loop completes. -- Robert Haas EDB: http://www.enterprisedb.com