On Fri, Mar 13, 2015 at 7:00 PM, Robert Haas <robertmh...@gmail.com> wrote:
>
> On Fri, Mar 13, 2015 at 8:59 AM, Amit Kapila <amit.kapil...@gmail.com> wrote:
> > We can't directly call DestroyParallelContext() to terminate workers as
> > it can so happen that by that time some of the workers are still not
> > started.
>
> That shouldn't be a problem.  TerminateBackgroundWorker() not only
> kills an existing worker if there is one, but also tells the
> postmaster that if it hasn't started the worker yet, it should not
> bother.  So at the conclusion of the first loop inside
> DestroyParallelContext(), every running worker will have received
> SIGTERM and no more workers will be started.
>

The problem occurs in the second loop inside DestroyParallelContext(),
where it calls WaitForBackgroundWorkerShutdown().  Basically,
WaitForBackgroundWorkerShutdown() only checks for BGWH_STOPPED
status; refer to the code below from the parallel-mode patch:

+ status = GetBackgroundWorkerPid(handle, &pid);
+ if (status == BGWH_STOPPED)
+ return status;

So if the status returned here is BGWH_NOT_YET_STARTED, it will
go on to WaitLatch and will wait there forever.

I think the fix is to check whether the status is BGWH_STOPPED or
BGWH_NOT_YET_STARTED, and in either case just return the status.
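
To illustrate the difference, here is a small standalone sketch; it is a
toy model, not the patch code.  The enum only mirrors the BgwHandleStatus
names, and MockWorkerHandle, mock_get_status() and mock_wait_for_shutdown()
are hypothetical stand-ins for the handle, GetBackgroundWorkerPid() and
the wait loop.  A bounded poll count stands in for blocking in WaitLatch,
and postmaster death is not modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in enum; only the names mirror BgwHandleStatus. */
typedef enum
{
    BGWH_STARTED,
    BGWH_NOT_YET_STARTED,
    BGWH_STOPPED,
    BGWH_POSTMASTER_DIED
} BgwHandleStatus;

/* Hypothetical mock handle: replays a scripted sequence of statuses. */
typedef struct
{
    const BgwHandleStatus *script;
    size_t      len;
    size_t      pos;
} MockWorkerHandle;

/* Return the next scripted status; the last entry repeats forever. */
static BgwHandleStatus
mock_get_status(MockWorkerHandle *handle)
{
    size_t      i;

    assert(handle != NULL && handle->len > 0);
    i = (handle->pos < handle->len) ? handle->pos : handle->len - 1;
    if (handle->pos < handle->len)
        handle->pos++;
    return handle->script[i];
}

/*
 * Poll until the worker is considered gone.  Returns the status that
 * ended the wait, or -1 if max_polls was exhausted (the stand-in for
 * waiting forever).  With treat_not_started_as_done set, this models
 * the proposed check; without it, the BGWH_STOPPED-only check.
 */
static int
mock_wait_for_shutdown(MockWorkerHandle *handle,
                       bool treat_not_started_as_done,
                       int max_polls)
{
    int         i;

    for (i = 0; i < max_polls; i++)
    {
        BgwHandleStatus status = mock_get_status(handle);

        if (status == BGWH_STOPPED)
            return status;
        if (treat_not_started_as_done && status == BGWH_NOT_YET_STARTED)
            return status;
        /* The real loop would block in WaitLatch() at this point. */
    }
    return -1;
}
```

With a handle that only ever reports BGWH_NOT_YET_STARTED, the
BGWH_STOPPED-only variant runs out of polls, while the variant with the
extra check returns immediately.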

What do you say?


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
