On Tue, Feb 22, 2022 at 09:37:11AM -0800, Nathan Bossart wrote: > In my testing, I found that when I killed the server just before unlink() > during WAL recyling, I ended up with links to the same file in pg_wal after > restarting. My latest test produced links to the same file for the current > WAL file and the next one. Maybe WAL recyling should use durable_rename() > instead of durable_rename_excl().
Here is an updated patch. -- Nathan Bossart Amazon Web Services: https://aws.amazon.com
>From e080d493985cf0a203122c1e07952b0f765019e4 Mon Sep 17 00:00:00 2001 From: Nathan Bossart <nathandboss...@gmail.com> Date: Tue, 22 Feb 2022 11:39:28 -0800 Subject: [PATCH v2 1/1] reduce archiving overhead --- src/backend/access/transam/xlog.c | 10 ++++++---- src/backend/postmaster/pgarch.c | 14 +++++++++++++- 2 files changed, 19 insertions(+), 5 deletions(-) diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c index 0d2bd7a357..2ad047052f 100644 --- a/src/backend/access/transam/xlog.c +++ b/src/backend/access/transam/xlog.c @@ -3334,13 +3334,15 @@ InstallXLogFileSegment(XLogSegNo *segno, char *tmppath, } /* - * Perform the rename using link if available, paranoidly trying to avoid - * overwriting an existing file (there shouldn't be one). + * Perform the rename. Ideally, we'd use link() and unlink() to avoid + * overwriting an existing file (there shouldn't be one). However, that + * approach opens up the possibility that pg_wal will contain multiple hard + * links to the same WAL file after a crash. */ - if (durable_rename_excl(tmppath, path, LOG) != 0) + if (durable_rename(tmppath, path, LOG) != 0) { LWLockRelease(ControlFileLock); - /* durable_rename_excl already emitted log message */ + /* durable_rename already emitted log message */ return false; } diff --git a/src/backend/postmaster/pgarch.c b/src/backend/postmaster/pgarch.c index d916ed39a8..641297e9f5 100644 --- a/src/backend/postmaster/pgarch.c +++ b/src/backend/postmaster/pgarch.c @@ -746,7 +746,19 @@ pgarch_archiveDone(char *xlog) StatusFilePath(rlogready, xlog, ".ready"); StatusFilePath(rlogdone, xlog, ".done"); - (void) durable_rename(rlogready, rlogdone, WARNING); + + /* + * To avoid extra overhead, we don't durably rename the .ready file to + * .done. Archive commands and libraries must already gracefully handle + * attempts to re-archive files (e.g., if the server crashes just before + * this function is called), so it should be okay if the .ready file + * reappears after a crash. + */ + if (rename(rlogready, rlogdone) < 0) + ereport(WARNING, + (errcode_for_file_access(), + errmsg("could not rename file \"%s\" to \"%s\": %m", + rlogready, rlogdone))); } -- 2.25.1