On Mon, May 10, 2021 at 4:35 AM Kyotaro Horiguchi <horikyota....@gmail.com> wrote: > It seems to me the issue here is not a race condition but > WaitForWALToBecomeAvailable initializing expectedTLEs with the history > of a improper timeline. So using recoveryTargetTLI instead of > receiveTLI for the case fixes this issue.
I agree. > I believe the 004_timeline_switch.pl detects your issue. And the > attached change fixes it. So why does this use recoveryTargetTLI instead of receiveTLI only conditionally? Why not do it all the time? The hard thing about this code is that the assumptions are not very clear. If we don't know why something is a certain way, then we might break things if we change it. Worse yet, if nobody else knows why it's like that either, then who knows what assumptions they might be making? It's hard to be sure that any change is safe. But that being said, we have a clear definition from the comments for what expectedTLEs is supposed to contain, and it's only going to end up with those contents if we initialize it from recoveryTargetTLI. So I am inclined to think that we ought to do that always, and if it breaks something, then that's a sign that some other part of the code also needs fixing, because apparently that hypothetical other part of the code doesn't work if expctedTLEs contains what the comments say that it should. Now maybe that's the wrong idea. But if so, then we're saying that the definition of expectedTLEs needs to be changed, and we should update the comments with the new definition, whatever it is. A lot of the confusion here results from the fact that the code and comments are inconsistent and we can't tell whether that's intentional or inadvertent. Let's not leave the next person who looks at this code wondering the same thing about whatever changes we make. -- Robert Haas EDB: http://www.enterprisedb.com