At Thu, 7 Sep 2023 10:53:41 +0000, "Hayato Kuroda (Fujitsu)" <kuroda.hay...@fujitsu.com> wrote in > My first idea is that to move the checking part to above, but this may not > handle > the case the postmaster is still alive (now sure this is real issue). Do we > have to > add a new indicator which ensures the identity of processes for windows? > Please tell me how you feel.
It doesn't seem to work as expected. We still lose the relationship between the PID file and the launched postmaster. > > Now, I > > also recall that the processes spawned by pg_ctl on Windows make the > > status handling rather tricky to reason about.. > > Did you say about the below comment? Currently I have no idea to make > codes more proper, sorry. > > ``` > * On Windows, we may be checking the postmaster's parent > shell, but > * that's fine for this purpose. > ``` Ditching cmd.exe seems like a big hassle. So, on the flip side, I tried to identify the postmaster PID using the shell's PID, and it seem to work. The APIs used are avaiable from XP/2003 onwards. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
diff --git a/src/bin/pg_ctl/pg_ctl.c b/src/bin/pg_ctl/pg_ctl.c index 807d7023a9..d54cc6ab54 100644 --- a/src/bin/pg_ctl/pg_ctl.c +++ b/src/bin/pg_ctl/pg_ctl.c @@ -32,6 +32,9 @@ #ifdef WIN32 /* on Unix, we don't need libpq */ #include "pqexpbuffer.h" #endif +#ifdef WIN32 +#include <tlhelp32.h> +#endif typedef enum @@ -425,6 +428,91 @@ free_readfile(char **optlines) * start/test/stop routines */ +#ifdef WIN32 +/* + * Find the PID of the launched postmaster. + * + * On Windows, the cmd.exe doesn't support the exec command. As a result, we + * don't directly get the postmaster's PID. This function identifies the PID of + * the postmaster started by the child cmd.exe. + * + * Returns the postmaster's PID. If the shell is alive but the postmaster is + * missing, returns 0. Otherwise terminates this command with an error. + */ +DWORD +win32_find_postmaster_pid(int shell_pid) +{ + HANDLE hSnapshot; + DWORD flags; + DWORD p_pid = shell_pid; + PROCESSENTRY32 ppe; + DWORD pm_pid = 0; + bool shell_exists = false; + bool multiple_children = false; + DWORD last_error; + + /* Create a process snapshot */ + hSnapshot = CreateToolhelp32Snapshot(TH32CS_SNAPPROCESS, shell_pid); + if (hSnapshot == INVALID_HANDLE_VALUE) + { + write_stderr(_("%s: CreateToolhelp32Snapshot failed\n"), + progname); + return; + } + + /* start iterating on the snapshot */ + ppe.dwSize = sizeof(PROCESSENTRY32); + if (!Process32First(hSnapshot, &ppe)) + { + write_stderr(_("%s: Process32First failed: errcode=%08x\n"), + progname, GetLastError()); + exit(1); + } + + /* iterate over the snapshot */ + do + { + if (ppe.th32ProcessID == shell_pid) + shell_exists = true; + if (ppe.th32ParentProcessID == shell_pid) + { + if (pm_pid > 0) + multiple_children = true; + pm_pid = ppe.th32ProcessID; + } + } + while (Process32Next(hSnapshot, &ppe)); + + /* avoid multiple calls primary for clarity, not out of necessity */ + last_error = GetLastError(); + if (last_error != ERROR_NO_MORE_FILES) + { + write_stderr(_("%s: Process32Next failed: errcode=%08x\n"), + progname, last_error); + exit(1); + } + CloseHandle(hSnapshot); + + /* assuming the launching shell executes a single process */ + if (multiple_children) + { + write_stderr(_("%s: lancher shell executed multiple processes\n"), + progname); + exit(1); + } + + /* Check if the process is gone for any reason */ + if (!shell_exists) + { + /* Report the condition as server start failure */ + write_stderr(_("%s: launcher shell died\n"), progname); + exit(1); + } + + return pm_pid; +} +#endif + /* * Start the postmaster and return its PID. * @@ -609,7 +697,11 @@ wait_for_postmaster_start(pid_t pm_pid, bool do_checkpoint) /* File is complete enough for us, parse it */ pid_t pmpid; time_t pmstart; - +#ifndef WIN32 + pid_t wait_pid = pm_pid; +#else + pid_t wait_pid = win32_find_postmaster_pid(pm_pid); +#endif /* * Make sanity checks. If it's for the wrong PID, or the recorded * start time is before pg_ctl started, then either we are looking @@ -619,21 +711,15 @@ wait_for_postmaster_start(pid_t pm_pid, bool do_checkpoint) */ pmpid = atol(optlines[LOCK_FILE_LINE_PID - 1]); pmstart = atol(optlines[LOCK_FILE_LINE_START_TIME - 1]); - if (pmstart >= start_time - 2 && -#ifndef WIN32 - pmpid == pm_pid -#else - /* Windows can only reject standalone-backend PIDs */ - pmpid > 0 -#endif - ) + + if (pmstart >= start_time - 2 && pmpid == wait_pid) { /* * OK, seems to be a valid pidfile from our child. Check the * status line (this assumes a v10 or later server). */ char *pmstatus = optlines[LOCK_FILE_LINE_PM_STATUS - 1]; - + if (strcmp(pmstatus, PM_STATUS_READY) == 0 || strcmp(pmstatus, PM_STATUS_STANDBY) == 0) {