On 10/07/2024 09:48, Thomas Munro wrote:
On Mon, Jul 8, 2024 at 9:38 PM Heikki Linnakangas <hlinn...@iki.fi> wrote:
The patch actually does both: it still does kill(SIGUSR1) and also sets
the latch.
Yeah, I had some ideas about supporting old extension code that really
wanted a SIGUSR1, but on reflection, the only reason anyone ever wants
that is so that sigusr1_handler can SetLatch(), which pairs with
WaitLatch() in WaitForBackgroundWorker*(). Let's go all the way and
assume that.
+1
I think it would be nice if RegisterDynamicBackgroundWorker() had a
"bool notify_me" argument, instead of requiring the caller to set
"bgw_notify_pid = MyProcPid" before the call. That's a
backwards-compatibility break, but maybe we should bite the bullet and
do it. Or we could do this in RegisterDynamicBackgroundWorker():
if (worker->bgw_notify_pid == MyProcPid)
worker->bgw_notify_pgprocno = MyProcNumber;
I think we can forbid setting pgw_notify_pid to anything else than 0 or
MyProcPid.
Another idea: we could keep the bgw_notify_pid field around for a
while, documented as unused and due to be removed in future. We could
automatically capture the caller's proc number. So you get latch
wakeups by default, which I expect many people want, and most people
could cope with even if they don't want them. If you really really
don't want them, you could set a new flag BGW_NO_NOTIFY.
Ok. I was going to say that it feels excessive to change the default
like that. However, searching for RegisterDynamicBackgroundWorker() in
github, I can't actually find any callers that don't set pg_notify_pid.
So yeah, make sense.
I have now done this part of the change in a new first patch. This
particular use of kill(SIGUSR1) is separate from the ProcSignal
removal, it's just that it relies on ProcSignal's handler's default
action of calling SetLatch(). It's needed so the ProcSignal-ectomy
can fully delete sigusr1_handler(), but it's not directly the same
thing, so it seemed good to split the patch.
PostmasterMarkPIDForWorkerNotify() is now unused. Which means that
bgworker_notify is never set, and BackgroundWorkerStopNotifications() is
never called either.
A SetLatchRobust would be nice. Looking at SetLatch(), I don't think it
can do any damage if you called it on a pointer to garbage, except if
the pointer itself is bogus, then just dereferencing it an cause a
segfault. So it would be nice to have a version specifically designed
with that in mind. For example, it could assume that the Latch's pid is
never legally equal to MyProcPid, because postmaster cannot own any latches.
Yeah I'm starting to think that all we need to do here is range-check
the proc number. Here's a version that adds:
ProcSetLatch(proc_number). Another idea would be for SetLatch(latch)
to sanitise the address of a latch, ie its offset and range.
Hmm, I don't think postmaster should trust ProcGlobal->allProcCount either.
The next problems to remove are, I think, the various SIGUSR2, SIGINT,
SIGTERM signals sent by the postmaster. These should clearly become
SendInterrupt() or ProcSetLatch().
+1
The problem here is that the
postmaster doesn't have the proc numbers yet. One idea is to teach
the postmaster to assign them! Not explored yet.
I've been thinking that we should:
a) assign every child process a PGPROC entry, and make postmaster
responsible for assigning them like you suggest. We'll need more PGPROC
entries, because currently a process doesn't reserve one until
authentication has happened. Or we change that behavior.
or
b) Use the pmsignal.c slot numbers for this, instead of ProcNumbers.
Postmaster already assigns those.
I'm kind of leaning towards b) for now, because it feels like a much
smaller patch. In the long run, it would be nice if every child process
had a ProcNumber, though. It was a nice simplification in v17 that we
don't have separate BackendId and ProcNumber anymore; similarly it would
be nice to not have separate PMChildSlot and ProcNumber concepts.
--
Heikki Linnakangas
Neon (https://neon.tech)