On Sat, Jan 23, 2021 at 10:06 AM Dilip Kumar <[email protected]> wrote: > > But I am afraid that whether this adjustment (setting based on > receiveTLI) is done based on some analysis or part of some bug fix. I > will try to find the history of this and maybe based on that we can > make a better decision.
I have done further analysis of this, so this of initializing the expectedTLEs from receiveTLI instead of recoveryTargetTLI is done in below commit. ===== ee994272ca50f70b53074f0febaec97e28f83c4e Author: Heikki Linnakangas <[email protected]> 2013-01-03 14:11:58 Committer: Heikki Linnakangas <[email protected]> 2013-01-03 14:11:58 Delay reading timeline history file until it's fetched from master. Streaming replication can fetch any missing timeline history files from the master, but recovery would read the timeline history file for the target timeline before reading the checkpoint record, and before walreceiver has had a chance to fetch it from the master. Delay reading it, and the sanity checks involving timeline history, until after reading the checkpoint record. There is at least one scenario where this makes a difference: if you take a base backup from a standby server right after a timeline switch, the WAL segment containing the initial checkpoint record will begin with an older timeline ID. Without the timeline history file, recovering that file will fail as the older timeline ID is not recognized to be an ancestor of the target timeline. If you try to recover from such a backup, using only streaming replication to fetch the WAL, this patch is required for that to work. ===== I did not understand one point about this commit message, "Without the timeline history file, recovering that file will fail as the older timeline ID is not recognized to be an ancestor of the target timeline. " I mean in which case this will be true? Now the problem is that we have initialized the expectTLEs based on the older timeline history file instead of recoveryTargetTLI, which is breaking the sanity of expectedTLEs. So in below function (rescanLatestTimeLine), if we find the newest TLI is same as recoveryTargetTLI then we assume we don't need to do anything but the problem is expectedTLEs is set to wrong target and we never update unless again timeline changes. So I think first we need to identify what the above commit is trying to achieve and then can we do it in a better way without breaking the sanity of expectedTLEs. rescanLatestTimeLine(void) { .... newtarget = findNewestTimeLine(recoveryTargetTLI); if (newtarget == recoveryTargetTLI) { /* No new timelines found */ return false; } ... newExpectedTLEs = readTimeLineHistory(newtarget); ... expectedTLEs = newExpectedTLEs; } -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
