On Wed, Apr 11, 2018 at 12:03 PM, Andres Freund <and...@anarazel.de> wrote:
> On 2018-04-11 11:57:20 +1200, Thomas Munro wrote:
>> Then if pgarch_ArchiverCopyLoop() and HandleStartupProcInterrupts()
>> (ie loops without waiting) adopt a prctl(PR_SET_PDEATHSIG)-based
>> approach where available as suggested by Andres[2] or fall back to
>> polling a reusable WaitEventSet (timeout = 0), then there'd be no more
>> calls to PostmasterIsAlive() outside latch.c.
>
> I'm still unconvinced by this. There's good reasons why code might be
> busy-looping without checking the latch, and we shouldn't force code to
> add more latch checks if unnecessary. Resetting the latch frequently can
> actually increase the amount of context switches considerably.

What latch?  There wouldn't be a latch in a WaitEventSet that is used
only to detect postmaster death.

I arrived at this idea via the realisation that the closest thing to
prctl(PR_SET_PDEATHSIG) on BSD-family systems today is
please-tell-my-kqueue-if-this-process-dies.  It so happens that my
kqueue patch already uses that instead of the pipe to detect
postmaster death.  The only problem is that you have to ask it, by
calling kevent().  In a busy loop like those two, where there is no
other kind of waiting going on, you could do that by calling kevent()
with timeout = 0 to check the queue.
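
To make that concrete, here is a rough standalone sketch of the
zero-timeout check.  It is not code from the kqueue patch: the names
watch_process_exit(), process_has_exited() and SKETCH_HAVE_KQUEUE are
made up for illustration, and the stubs for systems without kqueue
exist only so that the fragment compiles everywhere.

```c
#include <stddef.h>
#include <sys/types.h>
#include <time.h>
#include <unistd.h>

/* Hypothetical feature macro, for this sketch only. */
#if defined(__FreeBSD__) || defined(__NetBSD__) || defined(__OpenBSD__) || \
	defined(__DragonFly__) || defined(__APPLE__)
#define SKETCH_HAVE_KQUEUE 1
#include <sys/event.h>
#endif

/*
 * Ask the kernel to queue an event when process 'pid' exits.
 * Returns a kqueue descriptor, or -1 on failure or if kqueue is unavailable.
 */
static int
watch_process_exit(pid_t pid)
{
#ifdef SKETCH_HAVE_KQUEUE
	struct kevent change;
	int			kq = kqueue();

	if (kq < 0)
		return -1;
	EV_SET(&change, pid, EVFILT_PROC, EV_ADD, NOTE_EXIT, 0, NULL);
	if (kevent(kq, &change, 1, NULL, 0, NULL) < 0)
	{
		close(kq);
		return -1;
	}
	return kq;
#else
	(void) pid;
	return -1;
#endif
}

/*
 * Poll the queue without blocking (timeout = 0).
 * Returns 1 if the watched process has exited, 0 if it has not,
 * and -1 on error or if kqueue is unavailable.
 */
static int
process_has_exited(int kq)
{
#ifdef SKETCH_HAVE_KQUEUE
	struct kevent event;
	struct timespec zero = {0, 0};
	int			n = kevent(kq, NULL, 0, &event, 1, &zero);

	if (n < 0)
		return -1;
	return n > 0 && event.filter == EVFILT_PROC &&
		(event.fflags & NOTE_EXIT) != 0;
#else
	(void) kq;
	return -1;
#endif
}
```

A busy loop would call watch_process_exit(getppid()) once at startup
and then process_has_exited() on each iteration, which costs one
system call per check.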

You could probably figure out a way to hide the
prctl(PR_SET_PDEATHSIG)-based approach inside the WaitEventSet code,
with a fast path that doesn't make any system calls if the only event
registered is postmaster death (you can just check the global variable
set by your signal handler).  But I guess you wouldn't like the extra
function call, so you'd probably prefer to check the global variable
directly in the busy loop, in builds that have
prctl(PR_SET_PDEATHSIG).

-- 
Thomas Munro
http://www.enterprisedb.com
