On Tue, Jul 26, 2022 at 07:21:29AM -0400, David Steele wrote: > On 6/19/21 16:39, Noah Misch wrote: > >On Tue, Feb 02, 2021 at 07:14:16AM -0800, Noah Misch wrote: > >>Recycling and preallocation are wasteful during archive recovery, because > >>KeepFileRestoredFromArchive() unlinks every entry in its path. I propose to > >>fix the race by adding an XLogCtl flag indicating which regime currently > >>owns > >>the right to add long-term pg_wal directory entries. In the archive > >>recovery > >>regime, the checkpointer will not preallocate and will unlink old segments > >>instead of recycling them (like wal_recycle=off). XLogFileInit() will fail. > > > >Here's the implementation. Patches 1-4 suffice to stop the user-visible > >ERROR. Patch 5 avoids a spurious LOG-level message and wasted filesystem > >writes, and it provides some future-proofing. > > > >I was tempted to (but did not) just remove preallocation. Creating one file > >per checkpoint seems tiny relative to the max_wal_size=1GB default, so I > >expect it's hard to isolate any benefit. Under the old checkpoint_segments=3 > >default, a preallocated segment covered a respectable third of the next > >checkpoint. Before commit 63653f7 (2002), preallocation created more files. > > This also seems like it would fix the link issues we are seeing in [1]. > > I wonder if that would make it worth a back patch?
Perhaps. It's sad to have multiple people deep-diving into something fixed on HEAD. On the other hand, I'm not eager to spend risk-of-backpatch points on this. One alternative would be adding an errhint like "This is known to happen occasionally during archive recovery, where it is harmless." That has an unpolished look, but it's low-risk and may avoid deep-dive efforts. > [1] > https://www.postgresql.org/message-id/flat/CAKw-smBhLOGtRJTC5c%3DqKTPz8gz5%2BWPoVAXrHB6mY-1U4_N7-w%40mail.gmail.com