On 5/6/21, 1:01 PM, "Andres Freund" <and...@anarazel.de> wrote:
> If we leave history files and gaps in the .ready sequence aside for a
> second, we really only need an LSN or segment number describing the
> current "archive position". Then we can iterate over the segments
> between the "archive position" and the flush position (which we already
> know). Even if we needed to keep statting .ready/.done files (to handle
> gaps due to archive command mucking around with .ready/done), it'd still
> be a lot cheaper than what we do today.  It probably would even still be
> cheaper if we just statted all potentially relevant timeline history
> files all the time to send them first.
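
To make the quoted idea concrete, here is a rough sketch of walking the
segments between an "archive position" and the flush position, statting
only the .ready file for each candidate. It is an illustration only, not
the archiver's code: segments_to_archive() and its parameters are
hypothetical names, the default 16MB segment size is assumed, and
timeline history files are ignored.

```python
import os

WAL_SEGMENT_SIZE = 16 * 1024 * 1024                 # assumed default wal_segment_size
SEGMENTS_PER_XLOG_ID = 0x100000000 // WAL_SEGMENT_SIZE


def wal_file_name(tli, segno):
    """Build a 24-hex-digit WAL file name: timeline, log id, segment."""
    return "%08X%08X%08X" % (
        tli,
        segno // SEGMENTS_PER_XLOG_ID,
        segno % SEGMENTS_PER_XLOG_ID,
    )


def segments_to_archive(status_dir, tli, archive_segno, flush_segno):
    """Yield WAL file names that are ready for archival.

    archive_segno: next segment to archive (hypothetical "archive position").
    flush_segno:   segment containing the flush position; it is still being
                   written, so it is excluded.
    Only one stat per candidate segment is needed, instead of a scan of the
    whole archive_status directory.
    """
    for segno in range(archive_segno, flush_segno):
        name = wal_file_name(tli, segno)
        # A missing .ready file is treated as a gap and skipped.
        if os.path.exists(os.path.join(status_dir, name + ".ready")):
            yield name
```

The cost here grows with the distance between the two positions rather
than with the number of files in archive_status.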

My apologies for chiming in so late to this thread, but a similar idea
crossed my mind while working on a bug where .ready files get created
too early [0].  Specifically, instead of maintaining a status file per
WAL segment, I was thinking we could narrow it down to a couple of
files to keep track of the boundaries we care about:

    1. earliest_done: the oldest segment that has been archived and
       can be recycled/removed
    2. latest_done: the newest segment that has been archived
    3. latest_ready: the newest segment that is ready for archival

This might complicate matters for backup utilities that currently
modify the .ready/.done files, but it would simplify this archive
status stuff quite a bit and eliminate the need to worry about the
directory scans in the first place.
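
As a rough sketch of what tracking only those three boundaries could
look like (purely illustrative: the ArchiveBoundaries class and its
methods are hypothetical, segments are plain integers, persistence to
disk is left out, and segments are assumed to be archived strictly in
order with no gaps):

```python
from dataclasses import dataclass


@dataclass
class ArchiveBoundaries:
    earliest_done: int   # oldest archived segment that can be recycled/removed
    latest_done: int     # newest segment that has been archived
    latest_ready: int    # newest segment that is ready for archival

    def pending(self):
        """Segments that are ready but not yet archived."""
        return range(self.latest_done + 1, self.latest_ready + 1)

    def recyclable(self):
        """Archived segments that may be recycled or removed."""
        return range(self.earliest_done, self.latest_done + 1)

    def mark_ready(self, segno):
        # The ready boundary only ever moves forward.
        self.latest_ready = max(self.latest_ready, segno)

    def mark_done(self, segno):
        # Assumption: archiving happens in order, one segment at a time.
        if segno != self.latest_done + 1 or segno > self.latest_ready:
            raise ValueError("segment %d is not the next pending segment" % segno)
        self.latest_done = segno

    def mark_recycled(self, segno):
        """Record that all archived segments up to segno were removed."""
        if segno > self.latest_done:
            raise ValueError("segment %d has not been archived" % segno)
        self.earliest_done = max(self.earliest_done, segno + 1)
```

With something like this, finding the next segment to archive is a
lookup of latest_done + 1 rather than a directory scan.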

Nathan

[0] https://www.postgresql.org/message-id/flat/cbddfa01-6e40-46bb-9f98-9340f4379...@amazon.com
