On Tue, Jan 17, 2023 at 11:24 AM Thomas Munro <thomas.mu...@gmail.com> wrote:
> Another idea would be to teach the latch infrastructure itself to
> magically swap latch events to position 0.  Latches are usually
> prioritised; it's only in this rare race case that they are not.

I liked that idea for a while, but I suspect it is not really possible
to solve the problem completely this way, because it won't work on
Windows (see below) and the race I described earlier is probably not
the only one.  I think it must also be possible for poll() to ignore a
signal that becomes pending just as the system call begins and return
a socket fd that has also just become ready, without waiting (thus not
causing EINTR).  Then the handler would run after we return to
userspace, we'd see only the socket event, and a later call would see
the latch event.

So I think we probably need something like the attached, which I was
originally trying to avoid.

Looking into all that made me notice a related problem on Windows.
There's an interesting difference between the implementation of
select() in src/backend/port/win32/socket.c and the Windows
implementation of WaitEventSetBlock() in latch.c.  The latch.c code
only reports one event at a time, in event array order, because that's
WaitForMultipleObjects()'s contract and we expose that fairly
directly.  The older socket.c code uses that only for wakeup, and then
it polls *all* sockets to be able to report more than one at a time.
I was careful to use a large array of output events to preserve the
existing round-robin servicing of multiple server sockets, but I see
now that that only works on Unix.  On Windows, I suspect that one
socket receiving a fast enough stream of new connections could prevent
a second socket from ever being serviced.  I think we might want to do
something about that.
From 8b08f138a3286503e339b546d758ef683514a929 Mon Sep 17 00:00:00 2001
From: Thomas Munro <thomas.mu...@gmail.com>
Date: Fri, 20 Jan 2023 05:50:39 +1300
Subject: [PATCH] Process pending postmaster work before connections.

Modify the new event loop code from commit 7389aad6 so that it checks
for work requested by signal handlers even if it doesn't see a latch
event yet.

This gives priority to shutdown and reload requests where the latch will
be reported later in the event array, or in a later call to
WaitEventSetWait(), due to scheduling details.  In particular, this
guarantees that a SIGHUP-then-connect sequence (as seen in
authentication tests) causes the postmaster to process the reload before
accepting the connection.  If the WaitEventSetWait() call saw the socket
as ready, and the reload signal was generated before the connection,
then the latest time the signal handler should be able to run is after
poll/epoll_wait/kevent returns but before we check the
pending_pm_reload_request flag.

Reported-by: Hou Zhijie <houzj.f...@fujitsu.com>
Reported-by: Tom Lane <t...@sss.pgh.pa.us>
Discussion: https://postgr.es/m/OS0PR01MB57163D3BF2AB42ECAA94E5C394C29%40OS0PR01MB5716.jpnprd01.prod.outlook.com

diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index 9cedc1b9f0..cec3fe8dc5 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -1739,20 +1739,24 @@ ServerLoop(void)
 		for (int i = 0; i < nevents; i++)
 		{
 			if (events[i].events & WL_LATCH_SET)
-			{
 				ResetLatch(MyLatch);
 
-				/* Process work requested via signal handlers. */
-				if (pending_pm_shutdown_request)
-					process_pm_shutdown_request();
-				if (pending_pm_child_exit)
-					process_pm_child_exit();
-				if (pending_pm_reload_request)
-					process_pm_reload_request();
-				if (pending_pm_pmsignal)
-					process_pm_pmsignal();
-			}
-			else if (events[i].events & WL_SOCKET_ACCEPT)
+			/*
+			 * The following is done unconditionally, even if we didn't see
+			 * WL_LATCH_SET.  This gives high priority to shutdown and reload
+			 * requests where the latch happens to appear later in events[] or
+			 * will be reported by a later call to WaitEventSetWait().
+			 */
+			if (pending_pm_shutdown_request)
+				process_pm_shutdown_request();
+			if (pending_pm_child_exit)
+				process_pm_child_exit();
+			if (pending_pm_reload_request)
+				process_pm_reload_request();
+			if (pending_pm_pmsignal)
+				process_pm_pmsignal();
+
+			if (events[i].events & WL_SOCKET_ACCEPT)
 			{
 				Port	   *port;
 
-- 
2.35.1

Reply via email to