On Sat, Feb 18, 2017 at 11:53 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
>> It's a little surprising to me that the delay we're seeing here is
>> significant, because the death of a child should cause an immediate
>> SIGCHLD, resulting in a call to reaper(), resulting in a call to
>> waitpid().  Why's that not working?
>
> I can't say for sure about gharial, but gaur/pademelon is a single-CPU
> machine, ...

Ah, OK.  Makes sense.

>>> There seem to be many reasons why exit of background workers is done
>>> by postmaster like when they have to restart after a crash or if
>>> someone terminates them (TerminateBackgroundWorker()) or if the
>>> notification has to be sent at exit (bgw_notify_pid).  Moreover, there
>>> can be background workers without shared memory access as well which
>>> can't clear state from shared memory.  Another point to think is that
>>> background workers contain user-supplied code, so not sure how good it
>>> is to give them control of tinkering with shared data structures of
>>> the backend.  Now, one can argue that for some of the cases where it
>>> is possible background worker should cleanup shared memory state at
>>> the exit, but I think it is not directly related to the parallel
>>> query.  As far as the parallel query is concerned, it just needs to
>>> wait for workers exit to ensure that no more operations can be
>>> performed in workers after that point so that it can accumulate stats
>>> and retrieve some fixed parallel state information.
>
>> That's a pretty good list of reasons with which I agree.
>
> It seems largely bogus IMO, except possibly the "no access to shared
> memory" reason, which is irrelevant for the problem at hand; I see no very
> probable reason why backends would ever be interested in launching workers
> that they couldn't communicate with.  In particular, I'll buy the
> "user-supplied code" one only if we rip out every possibility of executing
> user-supplied C or plperlu or plpythonu code in standard backends.  There
> is *no* good reason for treating parallel workers as less reliable than
> standard backends; if anything, given the constraints on what we let them
> do, they should be more reliable.

I don't understand this paragraph at all.  There are certainly use
cases for backends starting workers with which they can't communicate,
like starting some kind of a daemon - that was the original purpose of
the background worker facility.  But even when the process is planning
to communicate with the worker, there's a time after the worker is
requested and before that communication link is established.  Failures
during the intervening time have to be handled cleanly, and if you
think that's trivial to do without postmaster involvement then I think
you don't really understand the problem yet.

> Basically, I think we need to fix things so that
> WaitForParallelWorkersToFinish doesn't return until the slot is free
> and there are no impediments to launching another worker.  Anything
> less is going to result in race conditions and intermittent misbehavior
> on heavily-loaded machines.  Personally I think that it'd be a good idea
> to get the postmaster out of the loop while we're at it, but the minimum
> change is to wait for the slot to be marked free by the postmaster.

Such a change can be made, but as I pointed out in the part you didn't
quote, there are reasons to wonder whether that will be a constructive
change in real life even if it's better for the regression tests.
Optimizing PostgreSQL for the use case of running regression tests in
the buildfarm at the expense of other use cases wouldn't be very
smart.  Maybe such a change is better in real-world applications too,
but that deserves at least a little bit of thought and substantive
discussion.

>>> I am not sure why you think so.
>
> In other words, you are willing to accept a system in which we can never
> be sure that any query after the first one in select_parallel actually ran
> in parallel?  That seems silly on its face, and an enormous restriction on
> what we'll actually be able to test.

Without denying that we may want to change the behavior here, if you
think this is the first place in the regression tests where the
behavior is timing-dependent, you obviously didn't attend
https://www.pgcon.org/2016/schedule/track/Hacking/927.en.html

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to