Re: cleanup patches for incremental backup

Nathan Bossart Wed, 24 Jan 2024 10:05:32 -0800

On Wed, Jan 24, 2024 at 12:46:16PM -0500, Robert Haas wrote:
> The "examining summary" line is generated based on the output of
> pg_available_wal_summaries(). The way that works is that the server
> calls readdir(), disassembles the filename into a TLI and two LSNs,
> and returns the result. Then, a fraction of a second later, the test
> script reassembles those components into a filename and finds the file
> missing. If the logic to translate between filenames and TLIs & LSNs
> were incorrect, the test would fail consistently. So the only
> explanation that seems to fit the facts is the file disappearing out
> from under us. But that really shouldn't happen. We do have code to
> remove such files in MaybeRemoveOldWalSummaries(), but it's only
> supposed to be nuking files more than 10 days old.
> 
> So I don't really have a theory here as to what could be happening. :-(


There might be an overflow risk in the cutoff time calculation, but I doubt
that's the root cause of these failures:

        /*
         * Files should only be removed if the last modification time precedes 
the
         * cutoff time we compute here.
         */
        cutoff_time = time(NULL) - 60 * wal_summary_keep_time;

Otherwise, I think we'll probably need to add some additional logging to
figure out what is happening...

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

Re: cleanup patches for incremental backup

Reply via email to