On Thu, Sep 26, 2019 at 5:15 PM Masahiko Sawada <sawada.m...@gmail.com> wrote:
>
> Hi,
>
> When we do archive recovery from a database cluster whose
> timeline ID is more than 2, pg_wal/RECOVERYHISTORY remains even
> after archive recovery has completed.
>
> The cause seems to be cbc55da556b, which moved exitArchiveRecovery()
> to before writeTimeLineHistory(). writeTimeLineHistory() restores the
> history file from the archive directory and therefore creates a
> RECOVERYHISTORY file in the pg_wal directory. We used to remove such
> temporary files in exitArchiveRecovery(), but with this commit the order
> of these calls is reversed. Therefore we create the
> RECOVERYHISTORY file after exiting archive recovery mode and
> leave it behind.
>
> To fix it, I think we can remove the RECOVERYHISTORY file before the
> history file is archived in writeTimeLineHistory(). The commit
> cbc55da556b is intended to minimize the window between the moment the
> file is written and the moment the end-of-recovery record is generated.
> So I think it's not good to put exitArchiveRecovery() after
> writeTimeLineHistory().
>
> This issue seems to exist in all supported versions as far as I can
> tell from the code, although I haven't tested all of them yet.
>
> I've attached a draft patch to fix this issue. A regression test might
> be required. Feedback and suggestions are very welcome.

What about moving the logic that removes RECOVERYXLOG and RECOVERYHISTORY
from exitArchiveRecovery() and performing it just before/after
RemoveNonParentXlogFiles()? That looks simple.

Regards,

--
Fujii Masao
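
To make the suggestion concrete, here is a minimal standalone sketch of the cleanup step pulled out into its own helper. The helper name remove_recovery_temp_files() and its wal_dir parameter are hypothetical and only for illustration; this is not the actual PostgreSQL code or the attached patch. It only assumes that the two temporary files live directly under the WAL directory.

```c
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper (not a PostgreSQL function): remove the temporary
 * files that archive recovery may leave in the WAL directory.
 */
static void
remove_recovery_temp_files(const char *wal_dir)
{
    static const char *const names[] = {"RECOVERYXLOG", "RECOVERYHISTORY"};
    char        path[1024];
    size_t      i;

    for (i = 0; i < sizeof(names) / sizeof(names[0]); i++)
    {
        snprintf(path, sizeof(path), "%s/%s", wal_dir, names[i]);
        unlink(path);           /* ignore any error; the file may not exist */
    }
}
```

The idea would be to call such a helper next to RemoveNonParentXlogFiles(), assuming that point is reached after writeTimeLineHistory() has run, so that a RECOVERYHISTORY file created while restoring the history file is still cleaned up.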