On Thu, Feb 24, 2022 at 09:55:53AM -0800, Nathan Bossart wrote:
> Yes.  I found that a crash at an unfortunate moment can produce multiple
> links to the same file in pg_wal, which seemed bad independent of archival.
> By fixing that (i.e., switching from durable_rename_excl() to
> durable_rename()), we not only avoid this problem, but we also avoid trying
> to archive a file the server is concurrently writing.  Then, after a crash,
> the WAL file to archive should either not exist (which is handled by the
> archiver) or contain the same contents as any preexisting archives.

I moved the fix for this to a new thread [0] since I think it should be
back-patched.  I've attached a new patch that only contains the part
related to reducing archiving overhead.

[0] https://www.postgresql.org/message-id/20220407182954.GA1231544%40nathanxps13
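
For context on why re-archiving after a crash should be harmless, here is
a minimal sketch of an archiver that accepts a second attempt when the
archive already holds a byte-identical copy.  The helpers files_identical(),
copy_file(), and archive_file() are hypothetical names used for
illustration only; they are not PostgreSQL functions, and a real archiver
would also need to fsync the copy before reporting success.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper: returns 1 if both files have identical contents,
 * 0 if they differ, and -1 if either file cannot be opened or read.
 */
static int
files_identical(const char *a, const char *b)
{
	FILE	   *fa = fopen(a, "rb");
	FILE	   *fb = fopen(b, "rb");
	int			result = 1;

	if (fa == NULL || fb == NULL)
		result = -1;

	while (result == 1)
	{
		char		bufa[8192];
		char		bufb[8192];
		size_t		na = fread(bufa, 1, sizeof(bufa), fa);
		size_t		nb = fread(bufb, 1, sizeof(bufb), fb);

		if (ferror(fa) || ferror(fb))
			result = -1;
		else if (na != nb || memcmp(bufa, bufb, na) != 0)
			result = 0;
		else if (na == 0)
			break;				/* both files reached EOF together */
	}

	if (fa != NULL)
		fclose(fa);
	if (fb != NULL)
		fclose(fb);
	return result;
}

/* Hypothetical helper: plain copy, no fsync (omitted for brevity). */
static int
copy_file(const char *src, const char *dst)
{
	FILE	   *in = fopen(src, "rb");
	FILE	   *out = (in != NULL) ? fopen(dst, "wb") : NULL;
	char		buf[8192];
	size_t		n;
	int			rc = 0;

	if (in == NULL || out == NULL)
	{
		if (in != NULL)
			fclose(in);
		return -1;
	}

	while ((n = fread(buf, 1, sizeof(buf), in)) > 0)
	{
		if (fwrite(buf, 1, n, out) != n)
		{
			rc = -1;
			break;
		}
	}
	if (ferror(in))
		rc = -1;
	fclose(in);
	if (fclose(out) != 0)
		rc = -1;
	return rc;
}

/*
 * Returns 0 on success.  A preexisting archive with the same contents is
 * treated as success; one with different contents is an error.
 */
static int
archive_file(const char *src, const char *dst)
{
	assert(src != NULL && dst != NULL);

	if (access(dst, F_OK) == 0)
		return (files_identical(src, dst) == 1) ? 0 : -1;
	return copy_file(src, dst);
}
```

With this shape, a .ready file that reappears after a crash leads to a
second archive_file() call that compares contents and returns success
without overwriting anything.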

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
From 6d8972412179fd3c82bb7e471966d4b7862bcbff Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathandboss...@gmail.com>
Date: Thu, 7 Apr 2022 14:11:59 -0700
Subject: [PATCH v3 1/1] Reduce overhead of renaming archive status files.

Presently, archive status files are durably renamed from .ready to
.done to indicate that a file has been archived.  Persisting this
rename to disk accounts for a significant amount of the overhead
associated with archiving.  While durably renaming the file
prevents re-archiving in most cases, archive commands and libraries
must already gracefully handle attempts to re-archive the last
archived file after a crash (e.g., a crash immediately after
archive_command exits but before the server renames the status
file).

This change reduces the amount of overhead associated with
archiving by using rename() instead of durable_rename() to rename
the archive status files.  As a consequence, the server is more
likely to attempt to re-archive files after a crash, but as noted
above, archive commands and modules are already expected to handle
this.  It is also possible that the server will attempt to
re-archive files that have been removed or recycled, but the
archiver already handles this, too.

Author: Nathan Bossart
---
 src/backend/postmaster/pgarch.c | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/src/backend/postmaster/pgarch.c b/src/backend/postmaster/pgarch.c
index 0c8ca29f73..17a4151c01 100644
--- a/src/backend/postmaster/pgarch.c
+++ b/src/backend/postmaster/pgarch.c
@@ -746,7 +746,19 @@ pgarch_archiveDone(char *xlog)
 
 	StatusFilePath(rlogready, xlog, ".ready");
 	StatusFilePath(rlogdone, xlog, ".done");
-	(void) durable_rename(rlogready, rlogdone, WARNING);
+
+	/*
+	 * To avoid extra overhead, we don't durably rename the .ready file to
+	 * .done.  Archive commands and libraries must gracefully handle attempts
+	 * to re-archive files (e.g., if the server crashes just before this
+	 * function is called), so it should be okay if the .ready file reappears
+	 * after a crash.
+	 */
+	if (rename(rlogready, rlogdone) < 0)
+		ereport(WARNING,
+				(errcode_for_file_access(),
+				 errmsg("could not rename file \"%s\" to \"%s\": %m",
+						rlogready, rlogdone)));
 }
 
 
-- 
2.25.1
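
To illustrate where the saved overhead comes from, here is a minimal
sketch in plain POSIX C of the difference between the two approaches.  It
only approximates what durable_rename() does (flush the file, rename it,
then flush the file and its directory); sketch_durable_rename(),
sketch_plain_rename(), and fsync_path() are hypothetical names, and the
explicit dirpath argument is an assumption made to keep the example short.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: open a file or directory and fsync it. */
static int
fsync_path(const char *path)
{
	int			fd = open(path, O_RDONLY);
	int			rc;

	if (fd < 0)
		return -1;
	rc = fsync(fd);
	close(fd);
	return rc;
}

/*
 * Rename that survives a crash: the file and the directory entry are
 * flushed to disk, at the cost of several fsync() calls.
 */
static int
sketch_durable_rename(const char *oldpath, const char *newpath,
					  const char *dirpath)
{
	assert(oldpath != NULL && newpath != NULL && dirpath != NULL);

	if (fsync_path(oldpath) < 0)
		return -1;
	if (rename(oldpath, newpath) < 0)
		return -1;
	if (fsync_path(newpath) < 0)
		return -1;
	return fsync_path(dirpath);
}

/*
 * Rename with no flushing: cheap, but after a crash the old name may
 * reappear because the directory update never reached disk.
 */
static int
sketch_plain_rename(const char *oldpath, const char *newpath)
{
	return rename(oldpath, newpath);
}
```

The plain variant is the trade-off the patch makes: the rename is still
atomic with respect to other processes, it is just not guaranteed to
persist across a crash.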
