At Mon, 23 Oct 2023 08:57:19 +0000, "Hayato Kuroda (Fujitsu)" 
<kuroda.hay...@fujitsu.com> wrote in 
> > I agree with your suggestion.  Ultimately, if there's a possibility
> > for this to be committed, the message will be consolidated to "could
> > not start server".
> 
> Based on the suggestion, I tried to update the patch.
> A new argument is_valid is added for reporting callee. Also, reporting formats
> are adjusted based on other functions. How do you think?

An equivalent check is already done shortly afterward in the calling
function. Therefore, we can simply remove the code path for "launcher
shell died", and it will work the same way. Please find the attached.

Other error cases will fit to "shouldn't occur under normal
conditions" errors.

There is a possibility that the launcher shell terminates while
postmaster is running. Even in such a case, the server continue
working without any problems. I contemplated accomodating this case
but the effort required seemed disproportionate to the possibility.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
>From 530dc21be72e4e7f5ea199a9fceee00bbb88cf75 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horikyota....@gmail.com>
Date: Tue, 24 Oct 2023 11:43:57 +0900
Subject: [PATCH v4 1/3] Disable autoruns for cmd.exe on Windows

On Windows, we use cmd.exe to launch the postmaster process for easy
redirection setup. However, cmd.exe may execute other programs at
startup due to autorun configurations. This patch adds /D flag to the
launcher cmd.exe command line to disable autorun settings written in
the registry.
---
 src/bin/pg_ctl/pg_ctl.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/bin/pg_ctl/pg_ctl.c b/src/bin/pg_ctl/pg_ctl.c
index 807d7023a9..3ac2fcc004 100644
--- a/src/bin/pg_ctl/pg_ctl.c
+++ b/src/bin/pg_ctl/pg_ctl.c
@@ -553,11 +553,11 @@ start_postmaster(void)
 		else
 			close(fd);
 
-		cmd = psprintf("\"%s\" /C \"\"%s\" %s%s < \"%s\" >> \"%s\" 2>&1\"",
+		cmd = psprintf("\"%s\" /D /C \"\"%s\" %s%s < \"%s\" >> \"%s\" 2>&1\"",
 					   comspec, exec_path, pgdata_opt, post_opts, DEVNULL, log_file);
 	}
 	else
-		cmd = psprintf("\"%s\" /C \"\"%s\" %s%s < \"%s\" 2>&1\"",
+		cmd = psprintf("\"%s\" /D /C \"\"%s\" %s%s < \"%s\" 2>&1\"",
 					   comspec, exec_path, pgdata_opt, post_opts, DEVNULL);
 
 	if (!CreateRestrictedProcess(cmd, &pi, false))
-- 
2.39.3

>From 913a0fbe0937331fca6ea4099834ed3001714339 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horikyota....@gmail.com>
Date: Tue, 24 Oct 2023 14:46:50 +0900
Subject: [PATCH v4 2/3] Improve pg_ctl postmaster process check on Windows

Currently pg_ctl on Windows does not verify that it actually executed
a postmaster process due to lack of process ID knowledge. This can
lead to false positives in cases where another pg_ctl instance starts
a different server simultaneously.
This patch adds the capability to identify the process ID of the
launched postmaster on Windows, similar to other OS versions, ensuring
more reliable detection of concurrent server startups.
---
 src/bin/pg_ctl/pg_ctl.c | 102 ++++++++++++++++++++++++++++++++++++----
 1 file changed, 93 insertions(+), 9 deletions(-)

diff --git a/src/bin/pg_ctl/pg_ctl.c b/src/bin/pg_ctl/pg_ctl.c
index 3ac2fcc004..9c0168b075 100644
--- a/src/bin/pg_ctl/pg_ctl.c
+++ b/src/bin/pg_ctl/pg_ctl.c
@@ -132,6 +132,7 @@ static void adjust_data_dir(void);
 
 #ifdef WIN32
 #include <versionhelpers.h>
+#include <tlhelp32.h>
 static bool pgwin32_IsInstalled(SC_HANDLE);
 static char *pgwin32_CommandLine(bool);
 static void pgwin32_doRegister(void);
@@ -142,6 +143,7 @@ static void WINAPI pgwin32_ServiceMain(DWORD, LPTSTR *);
 static void pgwin32_doRunAsService(void);
 static int	CreateRestrictedProcess(char *cmd, PROCESS_INFORMATION *processInfo, bool as_service);
 static PTOKEN_PRIVILEGES GetPrivilegesToDelete(HANDLE hToken);
+static pid_t pgwin32_find_postmaster_pid(pid_t shell_pid);
 #endif
 
 static pid_t get_pgpid(bool is_status_request);
@@ -609,7 +611,11 @@ wait_for_postmaster_start(pid_t pm_pid, bool do_checkpoint)
 			/* File is complete enough for us, parse it */
 			pid_t		pmpid;
 			time_t		pmstart;
-
+#ifndef WIN32
+			pid_t		wait_pid = pm_pid;
+#else
+			pid_t		wait_pid = pgwin32_find_postmaster_pid(pm_pid);
+#endif
 			/*
 			 * Make sanity checks.  If it's for the wrong PID, or the recorded
 			 * start time is before pg_ctl started, then either we are looking
@@ -619,14 +625,8 @@ wait_for_postmaster_start(pid_t pm_pid, bool do_checkpoint)
 			 */
 			pmpid = atol(optlines[LOCK_FILE_LINE_PID - 1]);
 			pmstart = atol(optlines[LOCK_FILE_LINE_START_TIME - 1]);
-			if (pmstart >= start_time - 2 &&
-#ifndef WIN32
-				pmpid == pm_pid
-#else
-			/* Windows can only reject standalone-backend PIDs */
-				pmpid > 0
-#endif
-				)
+
+			if (pmstart >= start_time - 2 && pmpid == wait_pid)
 			{
 				/*
 				 * OK, seems to be a valid pidfile from our child.  Check the
@@ -1950,6 +1950,90 @@ GetPrivilegesToDelete(HANDLE hToken)
 
 	return tokenPrivs;
 }
+
+/*
+ * Find the PID of the launched postmaster.
+ *
+ * On Windows, the cmd.exe doesn't support the exec command. As a result, we
+ * don't directly get the postmaster's PID. This function identifies the PID of
+ * the postmaster started by the child cmd.exe.
+ *
+ * Returns the postmaster's PID. If the shell is alive but the postmaster is
+ * missing, returns 0. Otherwise terminates this command with an error.
+ *
+ * This function uses PID 0 as an invalid value, assuming the system idle
+ * process occupies it and it won't be a PID for a shell or postmaster.
+ */
+static pid_t
+pgwin32_find_postmaster_pid(pid_t shell_pid)
+{
+	HANDLE hSnapshot;
+	PROCESSENTRY32 ppe;
+	pid_t pm_pid = 0;			/* abusing 0 as an invalid value */
+	bool multiple_children = false;
+	DWORD last_error;
+
+	/* create a process snapshot */
+	hSnapshot = CreateToolhelp32Snapshot(TH32CS_SNAPPROCESS, 0);
+	if (hSnapshot == INVALID_HANDLE_VALUE)
+	{
+		write_stderr(_("%s: CreateToolhelp32Snapshot failed\n"),
+					 progname);
+		exit(1);
+	}
+
+	/* start iterating on the snapshot */
+	ppe.dwSize = sizeof(PROCESSENTRY32);
+	if (!Process32First(hSnapshot, &ppe))
+	{
+		write_stderr(_("%s: Process32First failed: errcode=%08lx\n"),
+					 progname, GetLastError());
+		exit(1);
+	}
+
+	/*
+	 * Iterate over the snapshot
+	 *
+	 * Check for duplicate processes to ensure reliability.
+	 *
+	 * The launcher shell might start other cmd.exe instances or programs
+	 * besides postgres.exe. Veryfying the program file name is essential.
+	 *
+	 * The launcher shell process isn't checked in this function.  It will be
+	 * checked by the caller.
+	 */
+	do
+	{
+		if (ppe.th32ParentProcessID == shell_pid &&
+			strcmp("postgres.exe", ppe.szExeFile) == 0)
+		{
+			if (pm_pid != ppe.th32ProcessID && pm_pid != 0)
+				multiple_children = true;
+			pm_pid = ppe.th32ProcessID;
+		}
+	}
+	while (Process32Next(hSnapshot, &ppe));
+
+	/* avoid multiple calls primary for clarity, not out of necessity */
+	last_error = GetLastError();
+	if (last_error != ERROR_NO_MORE_FILES)
+	{
+		write_stderr(_("%s: Process32Next failed: errcode=%08lx\n"),
+					 progname, last_error);
+		exit(1);
+	}
+	CloseHandle(hSnapshot);
+
+	/* assuming the launching shell executes a single process */
+	if (multiple_children)
+	{
+		write_stderr(_("%s: multiple postmasters found\n"),
+					 progname);
+		exit(1);
+	}
+
+	return pm_pid;
+}
 #endif							/* WIN32 */
 
 static void
-- 
2.39.3

>From 77229eb99fbf67013a153828f8845db9550fccc5 Mon Sep 17 00:00:00 2001
From: Kyotaro Horiguchi <horikyota....@gmail.com>
Date: Tue, 24 Oct 2023 14:47:24 +0900
Subject: [PATCH v4 3/3] Remove short sleep from 001_start_stop.pl

Previous commits ensures reliable detection whether the postmaster has really
started or not, so the short sleep in a test is not needed anymore. It was
originally introduced by 6bcce258.
---
 src/bin/pg_ctl/t/001_start_stop.pl | 5 -----
 1 file changed, 5 deletions(-)

diff --git a/src/bin/pg_ctl/t/001_start_stop.pl b/src/bin/pg_ctl/t/001_start_stop.pl
index f019fe1703..590a82338f 100644
--- a/src/bin/pg_ctl/t/001_start_stop.pl
+++ b/src/bin/pg_ctl/t/001_start_stop.pl
@@ -46,11 +46,6 @@ my $ctlcmd = [
 ];
 command_like($ctlcmd, qr/done.*server started/s, 'pg_ctl start');
 
-# sleep here is because Windows builds can't check postmaster.pid exactly,
-# so they may mistake a pre-existing postmaster.pid for one created by the
-# postmaster they start.  Waiting more than the 2 seconds slop time allowed
-# by wait_for_postmaster() prevents that mistake.
-sleep 3 if ($windows_os);
 command_fails([ 'pg_ctl', 'start', '-D', "$tempdir/data" ],
 	'second pg_ctl start fails');
 command_ok([ 'pg_ctl', 'stop', '-D', "$tempdir/data" ], 'pg_ctl stop');
-- 
2.39.3

Reply via email to