From: Michael Paquier [mailto:mich...@paquier.xyz]
> On Wed, Mar 07, 2018 at 12:55:43AM +0900, Fujii Masao wrote:
> > So, what about, as another approach, making the checkpointer instead
> > of the startup process call RemoveNonParentXlogFiles() when
> > end-of-recovery checkpoint is executed? ISTM that a recovery doesn't
> > need to wait for RemoveNonParentXlogFiles() to end. Instead,
> > RemoveNonParentXlogFiles() seems to have to complete before the
> > checkpointer calls RemoveOldXlogFiles() and creates .ready files for
> > the "garbage" WAL files on the old timeline.
> > So it seems natural to leave that WAL recycle task to the checkpointer.
> Couldn't that impact the I/O performance at the end of recovery until the
> first post-recovery checkpoint is completed? Let's not forget that since
> 9.3 the end-of-recovery checkpoint is not triggered immediately, so there
> could be a delay. If WAL segments of the past timeline are recycled without
> waiting for this first checkpoint to happen then there is no need to create
> new, zero-emptied, segments post-recovery, which can count as well.
Good point. I understand you are referring to PreallocXlogFiles(), which may
create one new WAL file if RemoveNonParentXlogFiles() is not called or does
not recycle the WAL files on the old timeline.

The best hack (or compromise/kludge?) seems to be:
1. Modify the durable_xx() functions so that they don't fsync directory
changes when enableFsync is false.
2. RemoveNonParentXlogFiles() sets enableFsync to false before the while loop,
restores its original value after the while loop, and fsyncs pg_wal/ just
once.
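
To make the idea concrete, here is a rough standalone sketch in plain POSIX C.
It is not the actual PostgreSQL code. enableFsync is modelled as an ordinary
global, and durable_rename_sketch(), fsync_dir(), recycle_segments_sketch()
and dir_fsync_count are names I made up for illustration. The real
durable_xx() functions also fsync the files themselves, which is left out
here.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdbool.h>
#include <stdio.h>
#include <unistd.h>

/* Assumption: stand-in for PostgreSQL's enableFsync, modelled as a plain global. */
static bool enableFsync = true;

/* Hypothetical counter of directory fsync attempts, only here to observe the effect. */
static int dir_fsync_count = 0;

/* Hypothetical helper: fsync a directory so that renames inside it are durable. */
static int
fsync_dir(const char *dir)
{
    int fd;
    int rc;

    assert(dir != NULL);
    dir_fsync_count++;

    fd = open(dir, O_RDONLY);
    if (fd < 0)
        return -1;
    rc = fsync(fd);
    close(fd);
    return rc;
}

/*
 * Step 1: a simplified durable rename that skips the directory fsync
 * when enableFsync is false.
 */
static int
durable_rename_sketch(const char *oldpath, const char *newpath, const char *dir)
{
    if (rename(oldpath, newpath) != 0)
        return -1;
    if (!enableFsync)
        return 0;
    return fsync_dir(dir);
}

/*
 * Step 2: disable fsync around the loop, restore the saved value, and
 * then fsync the directory a single time.
 */
static int
recycle_segments_sketch(const char *dir, const char *const *oldpaths,
                        const char *const *newpaths, int n)
{
    bool save_enableFsync = enableFsync;
    int  rc = 0;

    enableFsync = false;
    for (int i = 0; i < n && rc == 0; i++)
        rc = durable_rename_sketch(oldpaths[i], newpaths[i], dir);
    enableFsync = save_enableFsync;

    /* One directory fsync for the whole batch, unless fsync is off anyway. */
    if (enableFsync && fsync_dir(dir) != 0)
        rc = -1;
    return rc;
}
```

With this shape, recycling n segments costs one directory fsync instead of one
per segment, and callers that use durable_rename_sketch() directly keep the
current per-call behavior.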
What do you think?