Re: [HACKERS] Instability in select_parallel regression test

Tom Lane Fri, 17 Feb 2017 07:46:39 -0800

Amit Kapila <[email protected]> writes:
> On Fri, Feb 17, 2017 at 11:22 AM, Tom Lane <[email protected]> wrote:
>> In short, it looks to me like ExecShutdownGatherWorkers doesn't actually
>> wait for parallel workers to finish (as its comment suggests is
>> necessary), so that on not-too-speedy machines the worker slots may all
>> still be in use when the next command wants some.


> ExecShutdownGatherWorkers() does wait for workers to exit/finish, but it
> doesn't wait for the postmaster to free the used slots and that is how
> that API is supposed to work.  There is a good chance that on slow
> machines the slots get freed up much later by postmaster after the
> workers have exited.

That seems like a seriously broken design to me, first because it can make
for a significant delay in the slots becoming available (which is what's
evidently causing these regression failures), and second because it's
simply bad design to load extra responsibilities onto the postmaster.
Especially ones that involve touching shared memory.

I think this needs to be changed, and promptly.  Why in the world don't
you simply have the workers clearing their slots when they exit?
We don't have an expectation that regular backends are incompetent to
clean up after themselves.  (Obviously, a crash exit is a different
case.)
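
To illustrate the alternative being suggested here, the following is a
minimal sketch, not PostgreSQL's actual code.  Every name in it
(WorkerSlot, worker_slots, claim_worker_slot, worker_exit_cleanup,
count_free_worker_slots, MAX_WORKER_SLOTS) is hypothetical, and the slot
table is an ordinary array standing in for shared memory.  It assumes a
single process and omits the locking or memory barriers that a real
shared-memory implementation would need.

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_WORKER_SLOTS 4      /* hypothetical pool size */

/* Hypothetical slot table; a real server would keep this in shared memory. */
typedef struct WorkerSlot
{
    bool        in_use;
} WorkerSlot;

static WorkerSlot worker_slots[MAX_WORKER_SLOTS];

/* Claim a free slot for a new worker; returns its index, or -1 if none is free. */
int
claim_worker_slot(void)
{
    for (int i = 0; i < MAX_WORKER_SLOTS; i++)
    {
        if (!worker_slots[i].in_use)
        {
            worker_slots[i].in_use = true;
            return i;
        }
    }
    return -1;
}

/*
 * Called by the worker itself as the last step of a normal exit, so the
 * slot is reusable as soon as the worker is done.  No other process has
 * to notice the exit first.  (A crash exit would still need someone else
 * to clean up.)
 */
void
worker_exit_cleanup(int slot)
{
    assert(slot >= 0 && slot < MAX_WORKER_SLOTS);
    worker_slots[slot].in_use = false;
}

/* Count the slots currently available to the next command. */
int
count_free_worker_slots(void)
{
    int         nfree = 0;

    for (int i = 0; i < MAX_WORKER_SLOTS; i++)
    {
        if (!worker_slots[i].in_use)
            nfree++;
    }
    return nfree;
}
```

With this arrangement, a command that waits for its workers to exit can
count on their slots being free immediately afterwards, instead of
depending on how soon a third process gets around to it.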

> I think what we need to do
> here is to move the test that needs workers to execute before other
> parallel query tests where there is no such requirement.

That's not fixing the problem, it's merely averting your eyes from
the symptom.

                        regards, tom lane

