Andres Freund writes:
> On 2017-04-20 00:50:13 -0400, Tom Lane wrote:
>> My first reaction was that that sounded like a lot more work than removing
>> two lines from maybe_start_bgworker and adjusting some comments.  But on
>> closer inspection, the slow-bgworker-start issue isn't the only problem
>> here.

> FWIW, I vaguely remember somewhat related issues on x86/linux too.

After sleeping and thinking more, I've realized that the
slow-bgworker-start issue actually exists on *every* platform; it's just
harder to hit when select() is interruptible.  But consider the case
where multiple bgworker-start requests arrive while ServerLoop is
actively executing (perhaps because a connection request just came in).
The postmaster has signals blocked, so nothing happens for the moment.
When we go around the loop and reach

            PG_SETMASK(&UnBlockSig);

the pending SIGUSR1 is delivered, and sigusr1_handler reads all the
bgworker start requests, and services just one of them.  Then control
returns and proceeds to

            selres = select(nSockets, &rmask, NULL, NULL, &timeout);

But now there's no interrupt pending.  So the remaining start requests
do not get serviced until (a) some other postmaster interrupt arrives,
or (b) the one-minute timeout elapses.  They could be waiting awhile.

Bottom line is that any request for more than one bgworker at a time
faces a non-negligible risk of suffering serious latency.
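
To make the arithmetic concrete, here is a toy model of the sequence
described above.  It is only a sketch, not postmaster code: the function
names and the max_per_call parameter are invented for illustration, and it
assumes that no other interrupt arrives and that max_per_call is positive.

```c
/* The one-minute select() timeout mentioned above. */
#define SELECT_TIMEOUT_SECS 60

/*
 * Hypothetical model: the signal handler runs once when signals are
 * unblocked and starts at most max_per_call workers.  Returns how many
 * start requests are still unserviced when the loop blocks in select().
 */
int
pending_at_select(int requests, int max_per_call)
{
    int started = (requests < max_per_call) ? requests : max_per_call;

    return requests - started;
}

/*
 * Worst-case wait before the next worker is started, assuming no other
 * postmaster interrupt arrives in the meantime.
 */
int
worst_case_wait_secs(int requests, int max_per_call)
{
    return (pending_at_select(requests, max_per_call) > 0)
        ? SELECT_TIMEOUT_SECS
        : 0;
}
```

With three requests and one start per call, pending_at_select(3, 1) leaves
two requests waiting and worst_case_wait_secs(3, 1) is the full timeout.
If the handler may start all of them, nothing is left pending and the wait
is zero.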

I'm coming back to the idea that at least in the back branches, the
thing to do is allow maybe_start_bgworker to start multiple workers.
Is there any actual evidence for the claim that that might have
bad side effects?

                        regards, tom lane
