On Tue, Jan 17, 2023 at 11:24 AM Thomas Munro <thomas.mu...@gmail.com> wrote: > Another idea would be to teach the latch infrastructure itself to > magically swap latch events to position 0. Latches are usually > prioritised; it's only in this rare race case that they are not.
I liked that idea for a while, but I suspect it is not really possible to solve the problem completely this way, because it won't work on Windows (see below) and the race I described earlier is probably not the only one. I think it must also be possible for poll() to ignore a signal that becomes pending just as the system call begins and return a socket fd that has also just become ready, without waiting (thus not causing EINTR). Then the handler would run after we return to userspace, we'd see only the socket event, and a later call would see the latch event. So I think we probably need something like the attached, which I was originally trying to avoid. Looking into all that made me notice a related problem on Windows. There's an interesting difference between the implementation of select() in src/backend/port/win32/socket.c and the Windows implementation of WaitEventSetBlock() in latch.c. The latch.c code only reports one event at a time, in event array order, because that's WaitForMultipleObjects()'s contract and we expose that fairly directly. The older socket.c code uses that only for wakeup, and then it polls *all* sockets to be able to report more than one at a time. I was careful to use a large array of output events to preserve the existing round-robin servicing of multiple server sockets, but I see now that that only works on Unix. On Windows, I suspect that one socket receiving a fast enough stream of new connections could prevent a second socket from ever being serviced. I think we might want to do something about that.
From 8b08f138a3286503e339b546d758ef683514a929 Mon Sep 17 00:00:00 2001 From: Thomas Munro <thomas.mu...@gmail.com> Date: Fri, 20 Jan 2023 05:50:39 +1300 Subject: [PATCH] Process pending postmaster work before connections. Modify the new event loop code from commit 7389aad6 so that it checks for work requested by signal handlers even if it doesn't see a latch event yet. This gives priority to shutdown and reload requests where the latch will be reported later in the event array, or in a later call to WaitEventSetWait(), due to scheduling details. In particular, this guarantees that a SIGHUP-then-connect sequence (as seen in authentication tests) causes the postmaster to process the reload before accepting the connection. If the WaitEventSetWait() call saw the socket as ready, and the reload signal was generated before the connection, then the latest time the signal handler should be able to run is after poll/epoll_wait/kevent returns but before we check the pending_pm_reload_request flag. Reported-by: Hou Zhijie <houzj.f...@fujitsu.com> Reported-by: Tom Lane <t...@sss.pgh.pa.us> Discussion: https://postgr.es/m/OS0PR01MB57163D3BF2AB42ECAA94E5C394C29%40OS0PR01MB5716.jpnprd01.prod.outlook.com diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c index 9cedc1b9f0..cec3fe8dc5 100644 --- a/src/backend/postmaster/postmaster.c +++ b/src/backend/postmaster/postmaster.c @@ -1739,20 +1739,24 @@ ServerLoop(void) for (int i = 0; i < nevents; i++) { if (events[i].events & WL_LATCH_SET) - { ResetLatch(MyLatch); - /* Process work requested via signal handlers. */ - if (pending_pm_shutdown_request) - process_pm_shutdown_request(); - if (pending_pm_child_exit) - process_pm_child_exit(); - if (pending_pm_reload_request) - process_pm_reload_request(); - if (pending_pm_pmsignal) - process_pm_pmsignal(); - } - else if (events[i].events & WL_SOCKET_ACCEPT) + /* + * The following is done unconditionally, even if we didn't see + * WL_LATCH_SET. This gives high priority to shutdown and reload + * requests where the latch happens to appear later in events[] or + * will be reported by a later call to WaitEventSetWait(). + */ + if (pending_pm_shutdown_request) + process_pm_shutdown_request(); + if (pending_pm_child_exit) + process_pm_child_exit(); + if (pending_pm_reload_request) + process_pm_reload_request(); + if (pending_pm_pmsignal) + process_pm_pmsignal(); + + if (events[i].events & WL_SOCKET_ACCEPT) { Port *port; -- 2.35.1