Re: Optimising latch signals
On Tue, Mar 9, 2021 at 1:09 PM Thomas Munro wrote: > On Tue, Mar 9, 2021 at 12:20 PM Alvaro Herrera > wrote: > > Hi, I don't know if you realized but we have two new Illumos members > > now (haddock and hake), and they're both failing initdb on signalfd() > > problems. > I'll wait a short time while he tries to see if that can be fixed (I They're green now. For the record: https://www.illumos.org/issues/13613
Re: Optimising latch signals
On Tue, Mar 9, 2021 at 12:20 PM Alvaro Herrera wrote: > On 2021-Mar-03, Thomas Munro wrote: > > On Mon, Mar 1, 2021 at 2:29 PM Thomas Munro wrote: > > > Time to watch the buildfarm to find out if my speculation about > > > illumos is correct... > > > > I just heard that damselfly's host has been decommissioned with no > > immediate plan for a replacement. That was the last of the > > Solaris-family animals testing master. It may be some time before I > > find out if my assumptions broke something on that OS... > > Hi, I don't know if you realized but we have two new Illumos members > now (haddock and hake), and they're both failing initdb on signalfd() > problems. Ah, cool. I'd been discussing this with their owner, who saw my message and wanted to provide replacements. Nice to see these start up even though I don't love the colour of their first results. In off-list emails, we got as far as determining that signalfd() fails on illumos when running inside a zone (= container), because /dev/signalfd is somehow not present. Apparently it works when running on the main host. Tracing revealed that it's trying to open that device and getting ENOENT here: running bootstrap script ... FATAL: XX000: signalfd() failed LOCATION: InitializeLatchSupport, latch.c:279 I'll wait a short time while he tries to see if that can be fixed (I have no clue if it's a configuration problem in some kind of zone creation scripts, or a bug, or what). If not, my fallback plan will be to change it to default to WAIT_USE_POLL on that OS until it can be fixed.
Re: Optimising latch signals
On 2021-Mar-03, Thomas Munro wrote: > On Mon, Mar 1, 2021 at 2:29 PM Thomas Munro wrote: > > Time to watch the buildfarm to find out if my speculation about > > illumos is correct... > > I just heard that damselfly's host has been decommissioned with no > immediate plan for a replacement. That was the last of the > Solaris-family animals testing master. It may be some time before I > find out if my assumptions broke something on that OS... Hi, I don't know if you realized but we have two new Illumos members now (haddock and hake), and they're both failing initdb on signalfd() problems. -- Álvaro Herrera Valdivia, Chile
Re: Optimising latch signals
On Mon, Mar 1, 2021 at 2:29 PM Thomas Munro wrote: > Time to watch the buildfarm to find out if my speculation about > illumos is correct... I just heard that damselfly's host has been decommissioned with no immediate plan for a replacement. That was the last of the Solaris-family animals testing master. It may be some time before I find out if my assumptions broke something on that OS...
Re: Optimising latch signals
On Sat, Feb 27, 2021 at 12:04 AM Thomas Munro wrote: > I'm planning to commit this soon if there are no objections. Pushed, with the addition of an SFD_CLOEXEC flag for the signalfd. Time to watch the buildfarm to find out if my speculation about illumos is correct...
Re: Optimising latch signals
Here's a new version with two small changes from Andres: 1. Reorder InitPostmasterChild() slightly to avoid hanging on EXEC_BACKEND builds. 2. Revert v2's use of raise(x) instead of kill(MyProcPid, x); glibc manages to generate 5 syscalls for raise(). I'm planning to commit this soon if there are no objections. From acacbf06aa54b4b139553b08d7f8511b0aaa0331 Mon Sep 17 00:00:00 2001 From: Thomas Munro Date: Sat, 8 Aug 2020 15:08:09 +1200 Subject: [PATCH v5 1/5] Optimize latches to send fewer signals. Don't send signals to processes that aren't currently sleeping. Author: Andres Freund Discussion: https://postgr.es/m/ca+hukgjjxpdpzbe0a3hyuywbvazuc89yx3jk9rfzgfv_khu...@mail.gmail.com --- src/backend/storage/ipc/latch.c | 24 src/include/storage/latch.h | 1 + 2 files changed, 25 insertions(+) diff --git a/src/backend/storage/ipc/latch.c b/src/backend/storage/ipc/latch.c index f2d005eea0..04aed96207 100644 --- a/src/backend/storage/ipc/latch.c +++ b/src/backend/storage/ipc/latch.c @@ -274,6 +274,7 @@ void InitLatch(Latch *latch) { latch->is_set = false; + latch->maybe_sleeping = false; latch->owner_pid = MyProcPid; latch->is_shared = false; @@ -321,6 +322,7 @@ InitSharedLatch(Latch *latch) #endif latch->is_set = false; + latch->maybe_sleeping = false; latch->owner_pid = 0; latch->is_shared = true; } @@ -523,6 +525,10 @@ SetLatch(Latch *latch) latch->is_set = true; + pg_memory_barrier(); + if (!latch->maybe_sleeping) + return; + #ifndef WIN32 /* @@ -589,6 +595,7 @@ ResetLatch(Latch *latch) { /* Only the owner should reset the latch */ Assert(latch->owner_pid == MyProcPid); + Assert(latch->maybe_sleeping == false); latch->is_set = false; @@ -1289,6 +1296,14 @@ WaitEventSetWait(WaitEventSet *set, long timeout, * ordering, so that we cannot miss seeing is_set if a notification * has already been queued. */ + if (set->latch && !set->latch->is_set) + { + /* about to sleep on a latch */ + set->latch->maybe_sleeping = true; + pg_memory_barrier(); + /* and recheck */ + } + if (set->latch && set->latch->is_set) { occurred_events->fd = PGINVALID_SOCKET; @@ -1299,6 +1314,9 @@ WaitEventSetWait(WaitEventSet *set, long timeout, occurred_events++; returned_events++; + /* could have been set above */ + set->latch->maybe_sleeping = false; + break; } @@ -1310,6 +1328,12 @@ WaitEventSetWait(WaitEventSet *set, long timeout, rc = WaitEventSetWaitBlock(set, cur_timeout, occurred_events, nevents); + if (set->latch) + { + Assert(set->latch->maybe_sleeping); + set->latch->maybe_sleeping = false; + } + if (rc == -1) break;/* timeout occurred */ else diff --git a/src/include/storage/latch.h b/src/include/storage/latch.h index 1468f30a8e..393591be03 100644 --- a/src/include/storage/latch.h +++ b/src/include/storage/latch.h @@ -110,6 +110,7 @@ typedef struct Latch { sig_atomic_t is_set; + sig_atomic_t maybe_sleeping; bool is_shared; int owner_pid; #ifdef WIN32 -- 2.30.0 From af31f00b4df0b65f177696891180345da8dabaee Mon Sep 17 00:00:00 2001 From: Thomas Munro Date: Thu, 26 Nov 2020 10:18:29 +1300 Subject: [PATCH v5 2/5] Use SIGURG rather than SIGUSR1 for latches. Traditionally, SIGUSR1 has been overloaded for ad-hoc signals, procsignal.c signals and latch.c wakeups. Move that last use over to a new dedicated signal. SIGURG is normally used to report out-of-band socket data, but PostgreSQL doesn't use that. The signal handler is now installed in all postmaster children by InitializeLatchSupport(). Those wishing to disconnect from it should call ShutdownLatchSupport() which will set it back to SIG_IGN. Future patches will use this separation to avoid the need for a signal handler on some operating systems. Discussion: https://postgr.es/m/ca+hukgjjxpdpzbe0a3hyuywbvazuc89yx3jk9rfzgfv_khu...@mail.gmail.com --- src/backend/postmaster/bgworker.c| 19 ++- src/backend/postmaster/postmaster.c | 4 +++ src/backend/storage/ipc/latch.c | 50 ++-- src/backend/storage/ipc/procsignal.c | 2 -- src/include/storage/latch.h | 11 +- 5 files changed, 40 insertions(+), 46 deletions(-) diff --git a/src/backend/postmaster/bgworker.c b/src/backend/postmaster/bgworker.c index 6fdea3fc2d..bbbc09b0b5 100644 --- a/src/backend/postmaster/bgworker.c +++ b/src/backend/postmaster/bgworker.c @@ -713,22 +713,6 @@ bgworker_die(SIGNAL_ARGS) MyBgworkerEntry->bgw_type))); } -/* - * Standard SIGUSR1 handler for unconnected workers - * - * Here, we want to make sure an unconnected worker will at least heed - * latch activity. - */ -static void -bgworker_sigusr1_handler(SIGNAL_ARGS) -{ - int save_errno = errno; - - latch_sigusr1_handler(); - - errno = save_errno; -} - /* * Start a new background worker * @@ -759,6 +743,7 @@ StartBackgroundWorker(void) */ if ((worker->bgw_flags & BGWORKER_SHMEM_ACCESS) == 0)
Re: Optimising latch signals
On Thu, Nov 19, 2020 at 4:49 PM Thomas Munro wrote: > I'll add this to the next commitfest. Let's see if this version fixes the Windows compile error and warning reported by cfbot. From 3eb542891a11d39047b28f6f33ae4e3d25bdd510 Mon Sep 17 00:00:00 2001 From: Thomas Munro Date: Sat, 8 Aug 2020 15:08:09 +1200 Subject: [PATCH v4 1/5] Optimize latches to send fewer signals. Don't send signals to processes that aren't currently sleeping. Author: Andres Freund Discussion: https://postgr.es/m/ca+hukgjjxpdpzbe0a3hyuywbvazuc89yx3jk9rfzgfv_khu...@mail.gmail.com --- src/backend/storage/ipc/latch.c | 24 src/include/storage/latch.h | 1 + 2 files changed, 25 insertions(+) diff --git a/src/backend/storage/ipc/latch.c b/src/backend/storage/ipc/latch.c index 24afc47d51..017b5c0c0f 100644 --- a/src/backend/storage/ipc/latch.c +++ b/src/backend/storage/ipc/latch.c @@ -274,6 +274,7 @@ void InitLatch(Latch *latch) { latch->is_set = false; + latch->maybe_sleeping = false; latch->owner_pid = MyProcPid; latch->is_shared = false; @@ -321,6 +322,7 @@ InitSharedLatch(Latch *latch) #endif latch->is_set = false; + latch->maybe_sleeping = false; latch->owner_pid = 0; latch->is_shared = true; } @@ -523,6 +525,10 @@ SetLatch(Latch *latch) latch->is_set = true; + pg_memory_barrier(); + if (!latch->maybe_sleeping) + return; + #ifndef WIN32 /* @@ -589,6 +595,7 @@ ResetLatch(Latch *latch) { /* Only the owner should reset the latch */ Assert(latch->owner_pid == MyProcPid); + Assert(latch->maybe_sleeping == false); latch->is_set = false; @@ -1289,6 +1296,14 @@ WaitEventSetWait(WaitEventSet *set, long timeout, * ordering, so that we cannot miss seeing is_set if a notification * has already been queued. */ + if (set->latch && !set->latch->is_set) + { + /* about to sleep on a latch */ + set->latch->maybe_sleeping = true; + pg_memory_barrier(); + /* and recheck */ + } + if (set->latch && set->latch->is_set) { occurred_events->fd = PGINVALID_SOCKET; @@ -1299,6 +1314,9 @@ WaitEventSetWait(WaitEventSet *set, long timeout, occurred_events++; returned_events++; + /* could have been set above */ + set->latch->maybe_sleeping = false; + break; } @@ -1310,6 +1328,12 @@ WaitEventSetWait(WaitEventSet *set, long timeout, rc = WaitEventSetWaitBlock(set, cur_timeout, occurred_events, nevents); + if (set->latch) + { + Assert(set->latch->maybe_sleeping); + set->latch->maybe_sleeping = false; + } + if (rc == -1) break;/* timeout occurred */ else diff --git a/src/include/storage/latch.h b/src/include/storage/latch.h index 7c742021fb..d0da7e5d31 100644 --- a/src/include/storage/latch.h +++ b/src/include/storage/latch.h @@ -110,6 +110,7 @@ typedef struct Latch { sig_atomic_t is_set; + sig_atomic_t maybe_sleeping; bool is_shared; int owner_pid; #ifdef WIN32 -- 2.20.1 From e0709315a443c76058a5a61e59bb54e248981d5d Mon Sep 17 00:00:00 2001 From: Thomas Munro Date: Thu, 26 Nov 2020 10:18:29 +1300 Subject: [PATCH v4 2/5] Use SIGURG rather than SIGUSR1 for latches. Traditionally, SIGUSR1 has been overloaded for ad-hoc signals, procsignal.c signals and latch.c wakeups. Move that last use over to a new dedicated signal. SIGURG is normally used to report out-of-band socket data, but PostgreSQL doesn't use that. The signal handler is now installed in all postmaster children by InitializeLatchSupport(). Those wishing to disconnect from it should call ShutdownLatchSupport() which will set it back to SIG_IGN. Future patches will use this separation to avoid the need for a signal handler on some operating systems. Discussion: https://postgr.es/m/ca+hukgjjxpdpzbe0a3hyuywbvazuc89yx3jk9rfzgfv_khu...@mail.gmail.com --- src/backend/postmaster/bgworker.c| 19 ++- src/backend/postmaster/postmaster.c | 4 +++ src/backend/storage/ipc/latch.c | 50 ++-- src/backend/storage/ipc/procsignal.c | 2 -- src/include/storage/latch.h | 11 +- 5 files changed, 40 insertions(+), 46 deletions(-) diff --git a/src/backend/postmaster/bgworker.c b/src/backend/postmaster/bgworker.c index 5a9a0e3435..9241b4edf4 100644 --- a/src/backend/postmaster/bgworker.c +++ b/src/backend/postmaster/bgworker.c @@ -654,22 +654,6 @@ bgworker_die(SIGNAL_ARGS) MyBgworkerEntry->bgw_type))); } -/* - * Standard SIGUSR1 handler for unconnected workers - * - * Here, we want to make sure an unconnected worker will at least heed - * latch activity. - */ -static void -bgworker_sigusr1_handler(SIGNAL_ARGS) -{ - int save_errno = errno; - - latch_sigusr1_handler(); - - errno = save_errno; -} - /* * Start a new background worker * @@ -700,6 +684,7 @@ StartBackgroundWorker(void) */ if ((worker->bgw_flags & BGWORKER_SHMEM_ACCESS) == 0) { + ShutdownLatchSupport(); dsm_detach_all(); PGSharedMemoryDetach(); } @@ -727,7 +712,7 @@ StartBackgroundWorker(void)
Re: Optimising latch signals
On Fri, Nov 13, 2020 at 12:42 PM Thomas Munro wrote: > 1. It's a bit clunky that pqinitmask() takes a new argument to say > whether SIGURG should be blocked; that's because the knowledge of > which latch implementation we're using is private to latch.c, and only > the epoll version needs to block it. I wonder how to make that > tidier. I found, I think, a better way: now InitializeLatchSupport() is in charge of managing the signal handler and modifying the signal mask. > 3. Maybe it's strange to continue to use overloaded SIGUSR1 on > non-epoll, non-kqueue systems; perhaps we should use SIGURG > everywhere. Fixed. > The improvement isn't quite as good on Linux, because as far as I can > tell you still have to have an (empty) signal handler installed and it > runs (can we find a way to avoid that?), but you still get to skip all > the pipe manipulation. I received an off-list clue that we could use a signalfd, which I'd discounted before because it still has to be drained; in fact the overheads saved outweigh that so this seems like a better solution, and I'm reliably informed that in a future WAIT_USE_IOURING mode it should be possible to get rid of the read too, so it seems like a good direction to go in. I'll add this to the next commitfest. From cfbcb0c1339be4e8cece232aa512cf89c539fbe4 Mon Sep 17 00:00:00 2001 From: Thomas Munro Date: Sat, 8 Aug 2020 15:08:09 +1200 Subject: [PATCH v3 1/4] Optimize latches to send fewer signals. Don't send signals to processes that aren't currently sleeping. Author: Andres Freund Discussion: https://postgr.es/m/ca+hukgjjxpdpzbe0a3hyuywbvazuc89yx3jk9rfzgfv_khu...@mail.gmail.com --- src/backend/storage/ipc/latch.c | 26 ++ src/include/storage/latch.h | 1 + 2 files changed, 27 insertions(+) diff --git a/src/backend/storage/ipc/latch.c b/src/backend/storage/ipc/latch.c index 24d44c982d..f4768b79c2 100644 --- a/src/backend/storage/ipc/latch.c +++ b/src/backend/storage/ipc/latch.c @@ -274,6 +274,7 @@ void InitLatch(Latch *latch) { latch->is_set = false; + latch->maybe_sleeping = false; latch->owner_pid = MyProcPid; latch->is_shared = false; @@ -321,6 +322,7 @@ InitSharedLatch(Latch *latch) #endif latch->is_set = false; + latch->maybe_sleeping = false; latch->owner_pid = 0; latch->is_shared = true; } @@ -523,6 +525,10 @@ SetLatch(Latch *latch) latch->is_set = true; + pg_memory_barrier(); + if (!latch->maybe_sleeping) + return; + #ifndef WIN32 /* @@ -589,6 +595,7 @@ ResetLatch(Latch *latch) { /* Only the owner should reset the latch */ Assert(latch->owner_pid == MyProcPid); + Assert(latch->maybe_sleeping == false); latch->is_set = false; @@ -1289,6 +1296,14 @@ WaitEventSetWait(WaitEventSet *set, long timeout, * ordering, so that we cannot miss seeing is_set if a notification * has already been queued. */ + if (set->latch && !set->latch->is_set) + { + /* about to sleep on a latch */ + set->latch->maybe_sleeping = true; + pg_memory_barrier(); + /* and recheck */ + } + if (set->latch && set->latch->is_set) { occurred_events->fd = PGINVALID_SOCKET; @@ -1299,6 +1314,9 @@ WaitEventSetWait(WaitEventSet *set, long timeout, occurred_events++; returned_events++; + /* could have been set above */ + set->latch->maybe_sleeping = false; + break; } @@ -1310,6 +1328,12 @@ WaitEventSetWait(WaitEventSet *set, long timeout, rc = WaitEventSetWaitBlock(set, cur_timeout, occurred_events, nevents); + if (set->latch) + { + Assert(set->latch->maybe_sleeping); + set->latch->maybe_sleeping = false; + } + if (rc == -1) break;/* timeout occurred */ else @@ -1399,6 +1423,8 @@ WaitEventSetWaitBlock(WaitEventSet *set, int cur_timeout, if (cur_event->events == WL_LATCH_SET && cur_epoll_event->events & (EPOLLIN | EPOLLERR | EPOLLHUP)) { + Assert(set->latch); + /* There's data in the self-pipe, clear it. */ drainSelfPipe(); diff --git a/src/include/storage/latch.h b/src/include/storage/latch.h index 7c742021fb..d0da7e5d31 100644 --- a/src/include/storage/latch.h +++ b/src/include/storage/latch.h @@ -110,6 +110,7 @@ typedef struct Latch { sig_atomic_t is_set; + sig_atomic_t maybe_sleeping; bool is_shared; int owner_pid; #ifdef WIN32 -- 2.20.1 From 6ce0cb44ae8d70beff50945775074d75b944fedb Mon Sep 17 00:00:00 2001 From: Thomas Munro Date: Thu, 19 Nov 2020 12:38:29 +1300 Subject: [PATCH v3 2/4] Use SIGURG rather than SIGUSR1 for latches. Traditionally, SIGUSR1 has been overloaded for ad-hoc signals, procsignal.c signals and latch.c wakeups. Move that last use over to a new dedicated signal. SIGURG reports out-of-band socket data, but PostgreSQL doesn't use that. The signal handler is now installed in all postmaster children by InitializeLatchSupport(). Those wishing to disconnect from it should call ShutdownLatchSupport() which will set it back to SIG_IGN. Future patches will use this
Re: Optimising latch signals
On Sun, Aug 9, 2020 at 11:48 PM Thomas Munro wrote: > Here are some more experimental patches to reduce system calls. > > 0001 skips sending signals when the recipient definitely isn't > waiting, using a new flag-and-memory-barrier dance. This seems to > skip around 12% of the kill() calls for "make check", and probably > helps with some replication configurations that do a lot of > signalling. Patch by Andres (with a small memory barrier adjustment > by me). > > 0002 gets rid of the latch self-pipe on Linux systems. > > 0003 does the same on *BSD/macOS systems. Here's a rebase over the recent signal handler/mask reorganisation. Some thoughts, on looking at this again after a while: 1. It's a bit clunky that pqinitmask() takes a new argument to say whether SIGURG should be blocked; that's because the knowledge of which latch implementation we're using is private to latch.c, and only the epoll version needs to block it. I wonder how to make that tidier. 2. It's a bit weird to have UnBlockSig (SIGURG remains blocked for epoll builds) and UnBlockAllSig (SIGURG is also unblocked). Maybe the naming is confusing. 3. Maybe it's strange to continue to use overloaded SIGUSR1 on non-epoll, non-kqueue systems; perhaps we should use SIGURG everywhere. 4. As a nano-optimisation, SetLatch() on a latch the current process owns might as well use raise(SIGURG) rather than kill(). This is necessary to close races when SetLatch(MyLatch) runs in a signal handler. In other words, although this patch uses signal blocking to close the race when other processes call SetLatch() and send us SIGURG, there's still a race if, say, SIGINT is sent to the checkpointer and it sets its own latch from its SIGINT handler function; in the user context it may be in WaitEventSetWait() having just seen latch->is_set == false, and now be about to enter epoll_pwait()/kevent() after the signal handler returns, so we need to give it a reason not to go to sleep. By way of motivation for removing the self-pipe, and where possible also the signal handler, here is a trace of the WAL writer handling three requests to write data, on a FreeBSD system, with the patch: kevent(9,0x0,0,{ SIGURG,... },1,{ 0.2 }) = 1 (0x1) pwrite(4,"\b\M-Q\^E\0\^A\0\0\0\0\0\M-/\^\"...,8192,0xaf) = 8192 (0x2000) kevent(9,0x0,0,{ SIGURG,... },1,{ 0.2 }) = 1 (0x1) pwrite(4,"\b\M-Q\^E\0\^A\0\0\0\0 \M-/\^\\0"...,8192,0xaf2000) = 8192 (0x2000) kevent(9,0x0,0,{ SIGURG,... },1,{ 0.2 }) = 1 (0x1) pwrite(4,"\b\M-Q\^E\0\^A\0\0\0\0`\M-/\^\\0"...,8192,0xaf6000) = 8192 (0x2000) Here is the same thing on unpatched master: kevent(11,0x0,0,0x801c195b0,1,{ 0.2 }) ERR#4 'Interrupted system call' SIGNAL 30 (SIGUSR1) code=SI_USER pid=66575 uid=1001 sigprocmask(SIG_SETMASK,{ SIGUSR1 },0x0) = 0 (0x0) write(10,"\0",1) = 1 (0x1) sigreturn(0x7fffc880)EJUSTRETURN pwrite(4,"\b\M-Q\^E\0\^A\0\0\0\0`\M-w)\0\0"...,8192,0xf76000) = 8192 (0x2000) kevent(11,0x0,0,{ 9,EVFILT_READ,0x0,0,0x1,0x801c19580 },1,{ 0.2 }) = 1 (0x1) read(9,"\0",16) = 1 (0x1) kevent(11,0x0,0,0x801c195b0,1,{ 0.2 }) ERR#4 'Interrupted system call' SIGNAL 30 (SIGUSR1) code=SI_USER pid=66575 uid=1001 sigprocmask(SIG_SETMASK,{ SIGUSR1 },0x0) = 0 (0x0) write(10,"\0",1) = 1 (0x1) sigreturn(0x7fffc880)EJUSTRETURN pwrite(4,"\b\M-Q\^E\0\^A\0\0\0\0 \M-y)\0\0"...,8192,0xf92000) = 8192 (0x2000) kevent(11,0x0,0,{ 9,EVFILT_READ,0x0,0,0x1,0x801c19580 },1,{ 0.2 }) = 1 (0x1) SIGNAL 30 (SIGUSR1) code=SI_USER pid=66575 uid=1001 sigprocmask(SIG_SETMASK,{ SIGUSR1 },0x0) = 0 (0x0) write(10,"\0",1) = 1 (0x1) sigreturn(0x7fffc880)EJUSTRETURN read(9,"\0\0",16)= 2 (0x2) kevent(11,0x0,0,0x801c195b0,1,{ 0.2 }) ERR#4 'Interrupted system call' SIGNAL 30 (SIGUSR1) code=SI_USER pid=66575 uid=1001 sigprocmask(SIG_SETMASK,{ SIGUSR1 },0x0) = 0 (0x0) write(10,"\0",1) = 1 (0x1) sigreturn(0x7fffc880)EJUSTRETURN pwrite(4,"\b\M-Q\^E\0\^A\0\0\0\0\0\M-z)\0"...,8192,0xfa) = 8192 (0x2000) kevent(11,0x0,0,{ 9,EVFILT_READ,0x0,0,0x1,0x801c19580 },1,{ 0.2 }) = 1 (0x1) read(9,"\0",16) = 1 (0x1) The improvement isn't quite as good on Linux, because as far as I can tell you still have to have an (empty) signal handler installed and it runs (can we find a way to avoid that?), but you still get to skip all the pipe manipulation. From 8dd1ae41ac3f2c8b378dfad64bbfe94eb41d39ab Mon Sep 17 00:00:00 2001 From: Thomas Munro Date: Sat, 8 Aug 2020 15:08:09 +1200 Subject: [PATCH v2 1/4] Optimize latches to send fewer signals. Don't send signals to processes that aren't currently sleeping. Author: Andres Freund Discussion:
Optimising latch signals
Hi hackers, Here are some more experimental patches to reduce system calls. 0001 skips sending signals when the recipient definitely isn't waiting, using a new flag-and-memory-barrier dance. This seems to skip around 12% of the kill() calls for "make check", and probably helps with some replication configurations that do a lot of signalling. Patch by Andres (with a small memory barrier adjustment by me). 0002 gets rid of the latch self-pipe on Linux systems. 0003 does the same on *BSD/macOS systems. The idea for 0002 and 0003 is to use a new dedicated signal just for latch wakeups, and keep it blocked (Linux) or ignored (BSD), except while waiting. There may be other ways to achieve this without bringing in a new signal, but it seemed important to leave SIGUSR1 unblocked for procsignals, and hard to figure out how to multiplex with existing SIGUSR2 users, so for the first attempt at prototyping this I arbitrarily chose SIGURG. From a45e9edaa315e62a0f823dbd55b42950fc8d5df3 Mon Sep 17 00:00:00 2001 From: Thomas Munro Date: Sat, 8 Aug 2020 15:08:09 +1200 Subject: [PATCH 1/3] Optimize latches to send fewer signals. Author: Andres Freund --- src/backend/storage/ipc/latch.c | 26 ++ src/include/storage/latch.h | 1 + 2 files changed, 27 insertions(+) diff --git a/src/backend/storage/ipc/latch.c b/src/backend/storage/ipc/latch.c index 4153cc8557..37528f29b5 100644 --- a/src/backend/storage/ipc/latch.c +++ b/src/backend/storage/ipc/latch.c @@ -274,6 +274,7 @@ void InitLatch(Latch *latch) { latch->is_set = false; + latch->maybe_sleeping = false; latch->owner_pid = MyProcPid; latch->is_shared = false; @@ -321,6 +322,7 @@ InitSharedLatch(Latch *latch) #endif latch->is_set = false; + latch->maybe_sleeping = false; latch->owner_pid = 0; latch->is_shared = true; } @@ -523,6 +525,10 @@ SetLatch(Latch *latch) latch->is_set = true; + pg_memory_barrier(); + if (!latch->maybe_sleeping) + return; + #ifndef WIN32 /* @@ -589,6 +595,7 @@ ResetLatch(Latch *latch) { /* Only the owner should reset the latch */ Assert(latch->owner_pid == MyProcPid); + Assert(latch->maybe_sleeping == false); latch->is_set = false; @@ -1273,6 +1280,14 @@ WaitEventSetWait(WaitEventSet *set, long timeout, * ordering, so that we cannot miss seeing is_set if a notification * has already been queued. */ + if (set->latch && !set->latch->is_set) + { + /* about to sleep on a latch */ + set->latch->maybe_sleeping = true; + pg_memory_barrier(); + /* and recheck */ + } + if (set->latch && set->latch->is_set) { occurred_events->fd = PGINVALID_SOCKET; @@ -1283,6 +1298,9 @@ WaitEventSetWait(WaitEventSet *set, long timeout, occurred_events++; returned_events++; + /* could have been set above */ + set->latch->maybe_sleeping = false; + break; } @@ -1294,6 +1312,12 @@ WaitEventSetWait(WaitEventSet *set, long timeout, rc = WaitEventSetWaitBlock(set, cur_timeout, occurred_events, nevents); + if (set->latch) + { + Assert(set->latch->maybe_sleeping); + set->latch->maybe_sleeping = false; + } + if (rc == -1) break;/* timeout occurred */ else @@ -1383,6 +1407,8 @@ WaitEventSetWaitBlock(WaitEventSet *set, int cur_timeout, if (cur_event->events == WL_LATCH_SET && cur_epoll_event->events & (EPOLLIN | EPOLLERR | EPOLLHUP)) { + Assert(set->latch); + /* There's data in the self-pipe, clear it. */ drainSelfPipe(); diff --git a/src/include/storage/latch.h b/src/include/storage/latch.h index 7c742021fb..d0da7e5d31 100644 --- a/src/include/storage/latch.h +++ b/src/include/storage/latch.h @@ -110,6 +110,7 @@ typedef struct Latch { sig_atomic_t is_set; + sig_atomic_t maybe_sleeping; bool is_shared; int owner_pid; #ifdef WIN32 -- 2.20.1 From 0cac26c1499fffa2a6f4e71f59e7a4e9a31be833 Mon Sep 17 00:00:00 2001 From: Thomas Munro Date: Sat, 8 Aug 2020 16:32:22 +1200 Subject: [PATCH 2/3] Remove self-pipe in WAIT_USE_EPOLL builds. Use SIGURG instead of SIGUSR1 for latch wake-ups, and unblock it only while in epoll_pwait() to avoid the race traditionally solved by the self-pipe trick. --- src/backend/libpq/pqsignal.c| 13 +++- src/backend/postmaster/postmaster.c | 4 +- src/backend/storage/ipc/latch.c | 97 +++-- src/backend/tcop/postgres.c | 2 +- src/include/libpq/pqsignal.h| 3 +- src/include/storage/latch.h | 2 + 6 files changed, 96 insertions(+), 25 deletions(-) diff --git a/src/backend/libpq/pqsignal.c b/src/backend/libpq/pqsignal.c index 9289493118..13ec3cd533 100644 --- a/src/backend/libpq/pqsignal.c +++ b/src/backend/libpq/pqsignal.c @@ -20,6 +20,7 @@ /* Global variables */ sigset_t UnBlockSig, + UnBlockAllSig, BlockSig, StartupBlockSig; @@ -35,13 +36,21 @@ sigset_t UnBlockSig, * collection; it's essentially BlockSig minus SIGTERM, SIGQUIT, SIGALRM. * * UnBlockSig is the set of