On Thu, Feb 24, 2022 at 09:55:53AM -0800, Nathan Bossart wrote:
> Yes.  I found that a crash at an unfortunate moment can produce multiple
> links to the same file in pg_wal, which seemed bad independent of archival.
> By fixing that (i.e., switching from durable_rename_excl() to
> durable_rename()), we not only avoid this problem, but we also avoid trying
> to archive a file the server is concurrently writing.  Then, after a crash,
> the WAL file to archive should either not exist (which is handled by the
> archiver) or contain the same contents as any preexisting archives.

I moved the fix for this to a new thread [0] since I think it should be
back-patched.  I've attached a new patch that only contains the part
related to reducing archiving overhead.

[0] https://www.postgresql.org/message-id/20220407182954.GA1231544%40nathanxps13

--
Nathan Bossart
Amazon Web Services: https://aws.amazon.com

From 6d8972412179fd3c82bb7e471966d4b7862bcbff Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathandboss...@gmail.com>
Date: Thu, 7 Apr 2022 14:11:59 -0700
Subject: [PATCH v3 1/1] Reduce overhead of renaming archive status files.

Presently, archive status files are durably renamed from .ready to
.done to indicate that a file has been archived.  Persisting this
rename to disk accounts for a significant amount of the overhead
associated with archiving.  While durably renaming the file
prevents re-archiving in most cases, archive commands and libraries
must already gracefully handle attempts to re-archive the last
archived file after a crash (e.g., a crash immediately after
archive_command exits but before the server renames the status
file).

This change reduces the amount of overhead associated with
archiving by using rename() instead of durable_rename() to rename
the archive status files.  As a consequence, the server is more
likely to attempt to re-archive files after a crash, but as noted
above, archive commands and modules are already expected to handle
this.  It is also possible that the server will attempt to
re-archive files that have been removed or recycled, but the
archiver already handles this, too.

Author: Nathan Bossart
---
 src/backend/postmaster/pgarch.c | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/src/backend/postmaster/pgarch.c b/src/backend/postmaster/pgarch.c
index 0c8ca29f73..17a4151c01 100644
--- a/src/backend/postmaster/pgarch.c
+++ b/src/backend/postmaster/pgarch.c
@@ -746,7 +746,19 @@ pgarch_archiveDone(char *xlog)
 
     StatusFilePath(rlogready, xlog, ".ready");
     StatusFilePath(rlogdone, xlog, ".done");
-    (void) durable_rename(rlogready, rlogdone, WARNING);
+
+    /*
+     * To avoid extra overhead, we don't durably rename the .ready file to
+     * .done.  Archive commands and libraries must gracefully handle attempts
+     * to re-archive files (e.g., if the server crashes just before this
+     * function is called), so it should be okay if the .ready file reappears
+     * after a crash.
+     */
+    if (rename(rlogready, rlogdone) < 0)
+        ereport(WARNING,
+                (errcode_for_file_access(),
+                 errmsg("could not rename file \"%s\" to \"%s\": %m",
+                        rlogready, rlogdone)));
 }
-- 
2.25.1
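
For anyone comparing the two calls outside the server, below is a minimal
standalone sketch of the trade-off described in the commit message.  It
assumes a POSIX system.  The names fsync_path(), demo_plain_rename(), and
demo_durable_rename() are hypothetical and exist only for illustration; they
are not the server's functions, and the durable variant only approximates what
a durable rename involves (flush the file, rename it, then flush the renamed
file and its parent directory).

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper: fsync a file or directory given its path.
 * Returns 0 on success, -1 on failure.
 */
int
fsync_path(const char *path)
{
    int fd = open(path, O_RDONLY);
    int rc;

    if (fd < 0)
        return -1;
    rc = fsync(fd);
    close(fd);
    return rc;
}

/*
 * Plain rename: a single system call.  The rename is atomic with respect to
 * other processes, but nothing forces it to disk, so after a crash the old
 * name (e.g., the .ready file) may reappear.
 */
int
demo_plain_rename(const char *from, const char *to)
{
    assert(from != NULL && to != NULL);
    return rename(from, to);
}

/*
 * Approximation of a durable rename (an assumption for illustration, not the
 * server's implementation): the extra fsync calls are where the additional
 * overhead comes from.
 */
int
demo_durable_rename(const char *from, const char *to, const char *parent_dir)
{
    assert(from != NULL && to != NULL && parent_dir != NULL);

    if (fsync_path(from) != 0)      /* flush the file contents first */
        return -1;
    if (rename(from, to) != 0)      /* same atomic rename as above */
        return -1;
    if (fsync_path(to) != 0)        /* flush the file under its new name */
        return -1;
    return fsync_path(parent_dir);  /* persist the directory entry change */
}
```

With the plain variant, a status file renamed just before a crash may come
back as .ready, which is why the archive command or module has to tolerate
being asked to archive the same file again.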