On Wed, Feb 21, 2018 at 5:27 PM, Tsunakawa, Takayuki
> From: Michael Paquier [mailto:mich...@paquier.xyz]
>> It seems to me that you would partially reintroduce the problems that
>> 1d4a0ab1 has fixed. In short, if a crash happens in the code paths calling
>> RemoveXlogFile with durable = false before fsync'ing pg_wal, then a rename
>> has no guarantee to be durable, so you could finish again with a file that
>> has an old name, but new contents. A crucial thing which matters for a rename
> Hmm, you're right. Even during recovery, RemoveOldXlogFiles() can't skip
> fsyncing pg_wal/ because new WAL records streamed from the master are written
> to recycled WAL files.
> After all, it seems to me that we have to stick with the current patch, which
> only handles RemoveNonParentXlogFiles().
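
To make the durability concern above concrete: a rename() only changes a
directory entry, so it is not guaranteed to survive a crash until the
containing directory (pg_wal in this discussion) has been fsync'ed. Here is
a minimal sketch of that pattern, assuming a POSIX system. The helper names
are hypothetical, and this is deliberately simpler than PostgreSQL's own
durable_rename(), which also fsyncs the file itself.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper: fsync a directory so that changes to its entries
 * (such as a rename) reach stable storage.  Returns 0 on success, -1 on error.
 */
static int
fsync_dir_sketch(const char *dirpath)
{
    int fd = open(dirpath, O_RDONLY);
    int rc;

    if (fd < 0)
        return -1;
    rc = fsync(fd);
    close(fd);
    return rc;
}

/*
 * Hypothetical helper: rename a file, then fsync the directory containing it.
 * Without the second step, a crash could leave the old name on disk even
 * though the file has since received new contents.
 */
static int
rename_durably_sketch(const char *oldpath, const char *newpath,
                      const char *dirpath)
{
    assert(oldpath != NULL && newpath != NULL && dirpath != NULL);

    if (rename(oldpath, newpath) != 0)
        return -1;
    return fsync_dir_sketch(dirpath);
}
```

Skipping the directory fsync per file and doing it once afterwards is cheaper,
but it leaves exactly the crash window described in the quoted text.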
But the approach that the patch uses would cause the performance problem
that Horiguchi-san pointed out upthread.
So, as another approach, what about making the checkpointer, instead of
the startup process, call RemoveNonParentXlogFiles() when the end-of-recovery
checkpoint is executed? ISTM that recovery doesn't need to wait for
RemoveNonParentXlogFiles() to finish. Instead, RemoveNonParentXlogFiles()
seems to only need to complete before the checkpointer calls RemoveOldXlogFiles()
and creates .ready files for the "garbage" WAL files on the old timeline.
So it seems natural to leave that WAL recycling task to the checkpointer.