On Tue, Feb 28, 2023 at 08:36:03PM -0800, Nathan Bossart wrote:
> On Sun, Feb 26, 2023 at 12:12:27PM -0800, Andres Freund wrote:
>> Partially I just want something that can easily be searched for, that can 
>> have
>> comments attached to it documenting why what it is doing is safe.
>> 
>> It'd not be a huge amount of work to have a slow and restricted string
>> interpolation support, to make it easier to write messages. Converting floats
>> is probably too hard to do safely, and I'm not sure %m can safely be
>> supported. But basic things like %d would be pretty simple.
>> 
>> Basically a loop around the format string that directly writes to stderr 
>> using
>> write(), and only supports a signal safe subset of normal format strings.
> 
> Got it, thanks.  I will try to put something together along these lines,
> although I don't know if I'll pick up the interpolation support in this
> thread.

Here is an attempt at adding a signal safe function for writing to STDERR.

I didn't add support for format strings, but looking ahead, I think one
challenge will be avoiding va_start() and friends.  In any case, IMO format
string support probably deserves its own thread.

-- 
Nathan Bossart
Amazon Web Services: https://aws.amazon.com
>From b454604da126dc0894449ef4a3f49227f42a4df6 Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathandboss...@gmail.com>
Date: Thu, 23 Feb 2023 14:27:48 -0800
Subject: [PATCH v10 1/2] Move extra code out of the Pre/PostRestoreCommand()
 block.

If SIGTERM is received within this block, the startup process will
immediately proc_exit() in the signal handler, so it is inadvisable
to include any more code than is required in this section.  This
change moves the code recently added to this block (see 1b06d7b and
7fed801) to outside of the block.  This ensures that only system()
is called while proc_exit() might be called in the SIGTERM handler,
which is how this code worked from v8.4 to v14.
---
 src/backend/access/transam/xlogarchive.c | 15 +++++++++++----
 1 file changed, 11 insertions(+), 4 deletions(-)

diff --git a/src/backend/access/transam/xlogarchive.c b/src/backend/access/transam/xlogarchive.c
index fcc87ff44f..41684418b6 100644
--- a/src/backend/access/transam/xlogarchive.c
+++ b/src/backend/access/transam/xlogarchive.c
@@ -159,20 +159,27 @@ RestoreArchivedFile(char *path, const char *xlogfname,
 			(errmsg_internal("executing restore command \"%s\"",
 							 xlogRestoreCmd)));
 
+	fflush(NULL);
+	pgstat_report_wait_start(WAIT_EVENT_RESTORE_COMMAND);
+
 	/*
-	 * Check signals before restore command and reset afterwards.
+	 * PreRestoreCommand() informs the SIGTERM handler for the startup process
+	 * that it should proc_exit() right away.  This is done for the duration of
+	 * the system() call because there isn't a good way to break out while it
+	 * is executing.  Since we might call proc_exit() in a signal handler, it
+	 * is best to put any additional logic before or after the
+	 * PreRestoreCommand()/PostRestoreCommand() section.
 	 */
 	PreRestoreCommand();
 
 	/*
 	 * Copy xlog from archival storage to XLOGDIR
 	 */
-	fflush(NULL);
-	pgstat_report_wait_start(WAIT_EVENT_RESTORE_COMMAND);
 	rc = system(xlogRestoreCmd);
-	pgstat_report_wait_end();
 
 	PostRestoreCommand();
+
+	pgstat_report_wait_end();
 	pfree(xlogRestoreCmd);
 
 	if (rc == 0)
-- 
2.25.1

>From fb6957da01f11b75d1a1966f32b00e2e77c789a0 Mon Sep 17 00:00:00 2001
From: Nathan Bossart <nathandboss...@gmail.com>
Date: Tue, 14 Feb 2023 09:44:53 -0800
Subject: [PATCH v10 2/2] Don't proc_exit() in startup's SIGTERM handler if
 forked by system().

Instead, emit a message to STDERR and _exit() in this case.  This
change also adds assertions to proc_exit(), ProcKill(), and
AuxiliaryProcKill() to verify that these functions are not called
by a process forked by system(), etc.
---
 src/backend/postmaster/startup.c | 17 ++++++++++++++++-
 src/backend/storage/ipc/ipc.c    |  3 +++
 src/backend/storage/lmgr/proc.c  |  2 ++
 src/backend/utils/error/elog.c   | 28 ++++++++++++++++++++++++++++
 src/include/utils/elog.h         |  6 +-----
 5 files changed, 50 insertions(+), 6 deletions(-)

diff --git a/src/backend/postmaster/startup.c b/src/backend/postmaster/startup.c
index efc2580536..0e7de26bc2 100644
--- a/src/backend/postmaster/startup.c
+++ b/src/backend/postmaster/startup.c
@@ -19,6 +19,8 @@
  */
 #include "postgres.h"
 
+#include <unistd.h>
+
 #include "access/xlog.h"
 #include "access/xlogrecovery.h"
 #include "access/xlogutils.h"
@@ -121,7 +123,20 @@ StartupProcShutdownHandler(SIGNAL_ARGS)
 	int			save_errno = errno;
 
 	if (in_restore_command)
-		proc_exit(1);
+	{
+		/*
+		 * If we are in a child process (e.g., forked by system() in
+		 * RestoreArchivedFile()), we don't want to call any exit callbacks.
+		 * The parent will take care of that.
+		 */
+		if (MyProcPid == (int) getpid())
+			proc_exit(1);
+		else
+		{
+			write_stderr_signal_safe("StartupProcShutdownHandler() called in child process\n");
+			_exit(1);
+		}
+	}
 	else
 		shutdown_requested = true;
 	WakeupRecovery();
diff --git a/src/backend/storage/ipc/ipc.c b/src/backend/storage/ipc/ipc.c
index 1904d21795..d5097dc008 100644
--- a/src/backend/storage/ipc/ipc.c
+++ b/src/backend/storage/ipc/ipc.c
@@ -103,6 +103,9 @@ static int	on_proc_exit_index,
 void
 proc_exit(int code)
 {
+	/* proc_exit() is not safe in forked processes from system(), etc. */
+	Assert(MyProcPid == (int) getpid());
+
 	/* Clean up everything that must be cleaned up */
 	proc_exit_prepare(code);
 
diff --git a/src/backend/storage/lmgr/proc.c b/src/backend/storage/lmgr/proc.c
index 22b4278610..b9e2c3aafe 100644
--- a/src/backend/storage/lmgr/proc.c
+++ b/src/backend/storage/lmgr/proc.c
@@ -805,6 +805,7 @@ ProcKill(int code, Datum arg)
 	dlist_head *procgloballist;
 
 	Assert(MyProc != NULL);
+	Assert(MyProc->pid == (int) getpid());  /* not safe if forked by system(), etc. */
 
 	/* Make sure we're out of the sync rep lists */
 	SyncRepCleanupAtProcExit();
@@ -925,6 +926,7 @@ AuxiliaryProcKill(int code, Datum arg)
 	PGPROC	   *proc;
 
 	Assert(proctype >= 0 && proctype < NUM_AUXILIARY_PROCS);
+	Assert(MyProc->pid == (int) getpid());  /* not safe if forked by system(), etc. */
 
 	auxproc = &AuxiliaryProcs[proctype];
 
diff --git a/src/backend/utils/error/elog.c b/src/backend/utils/error/elog.c
index 5898100acb..f9925c8d8e 100644
--- a/src/backend/utils/error/elog.c
+++ b/src/backend/utils/error/elog.c
@@ -3730,6 +3730,34 @@ write_stderr(const char *fmt,...)
 }
 
 
+/*
+ * Write a message to STDERR using only async-signal-safe functions.  This can
+ * be used to safely emit a message from a signal handler.
+ *
+ * TODO: It is likely possible to safely do a limited amount of string
+ * interpolation (e.g., %s and %d), but that is not presently supported.
+ */
+void
+write_stderr_signal_safe(const char *fmt)
+{
+	int			nwritten = 0;
+	int			ntotal = strlen(fmt);
+
+	while (nwritten < ntotal)
+	{
+		int			rc;
+
+		rc = write(STDERR_FILENO, fmt + nwritten, ntotal - nwritten);
+
+		/* Just give up on error.  There isn't much else we can do. */
+		if (rc == -1)
+			return;
+
+		nwritten += rc;
+	}
+}
+
+
 /*
  * Adjust the level of a recovery-related message per trace_recovery_messages.
  *
diff --git a/src/include/utils/elog.h b/src/include/utils/elog.h
index 4a9562fdaa..20f1b54c6a 100644
--- a/src/include/utils/elog.h
+++ b/src/include/utils/elog.h
@@ -529,11 +529,7 @@ extern void write_pipe_chunks(char *data, int len, int dest);
 extern void write_csvlog(ErrorData *edata);
 extern void write_jsonlog(ErrorData *edata);
 
-/*
- * Write errors to stderr (or by equal means when stderr is
- * not available). Used before ereport/elog can be used
- * safely (memory context, GUC load etc)
- */
 extern void write_stderr(const char *fmt,...) pg_attribute_printf(1, 2);
+extern void write_stderr_signal_safe(const char *fmt);
 
 #endif							/* ELOG_H */
-- 
2.25.1

Reply via email to