Re: Wait for parallel workers to attach

2018-02-02 Thread Peter Geoghegan
On Fri, Feb 2, 2018 at 6:11 AM, Robert Haas  wrote:
>> Fair enough, you can proceed with the patch.
>
> Committed.  Now, on to the main event!

Thank you both.

-- 
Peter Geoghegan



Re: Wait for parallel workers to attach

2018-02-02 Thread Robert Haas
On Thu, Feb 1, 2018 at 9:48 PM, Amit Kapila  wrote:
> On Thu, Feb 1, 2018 at 9:09 PM, Robert Haas  wrote:
>> On Wed, Jan 31, 2018 at 10:08 PM, Amit Kapila  
>> wrote:
>>> I think suggesting to use this API to wait "for a specific worker"
>>> doesn't seem like a good idea as it doesn't have any such provision.
>>
>> I see your point, but in the absence of a more specific API it could
>> be used that way, and it wouldn't be unreasonable.  Just might wait a
>> little longer than absolutely necessary.
>
> Fair enough, you can proceed with the patch.

Committed.  Now, on to the main event!

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Wait for parallel workers to attach

2018-02-01 Thread Amit Kapila
On Thu, Feb 1, 2018 at 9:09 PM, Robert Haas  wrote:
> On Wed, Jan 31, 2018 at 10:08 PM, Amit Kapila  wrote:
>> I think suggesting to use this API to wait "for a specific worker"
>> doesn't seem like a good idea as it doesn't have any such provision.
>
> I see your point, but in the absence of a more specific API it could
> be used that way, and it wouldn't be unreasonable.  Just might wait a
> little longer than absolutely necessary.
>

Fair enough, you can proceed with the patch.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: Wait for parallel workers to attach

2018-02-01 Thread Robert Haas
On Wed, Jan 31, 2018 at 10:08 PM, Amit Kapila  wrote:
> I think suggesting to use this API to wait "for a specific worker"
> doesn't seem like a good idea as it doesn't have any such provision.

I see your point, but in the absence of a more specific API it could
be used that way, and it wouldn't be unreasonable.  Just might wait a
little longer than absolutely necessary.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Wait for parallel workers to attach

2018-01-31 Thread Amit Kapila
On Wed, Jan 31, 2018 at 9:53 PM, Robert Haas  wrote:
> On Wed, Jan 31, 2018 at 3:57 AM, Amit Kapila  wrote:
>>> * There might be some opportunity to share some of the new code with
>>> the code recently committed to WaitForParallelWorkersToFinish(). For
>>> one thing, the logic in this block could be refactored into a
>>> dedicated function that is called by both
>>> WaitForParallelWorkersToAttach() and WaitForParallelWorkersToFinish():
>>
>> I had thought about this earlier but left it as the common code was
>> too less, however as you have pointed out, I had extracted the common
>> code into a separate function.
>
> I like it better the other way, so I've changed it back in the
> attached version,
>

Okay, no problem.

> which also works over the comments fairly heavily.
>

+ * However, if the leader needs to wait for
+ * all of its workers or for a specific worker, it may want to call this
+ * function before doing so.

I think suggesting to use this API to wait "for a specific worker"
doesn't seem like a good idea as it doesn't have any such provision.
Other than that the patch looks good.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: Wait for parallel workers to attach

2018-01-31 Thread Robert Haas
On Wed, Jan 31, 2018 at 3:57 AM, Amit Kapila  wrote:
>> * There might be some opportunity to share some of the new code with
>> the code recently committed to WaitForParallelWorkersToFinish(). For
>> one thing, the logic in this block could be refactored into a
>> dedicated function that is called by both
>> WaitForParallelWorkersToAttach() and WaitForParallelWorkersToFinish():
>
> I had thought about this earlier but left it as the common code was
> too less, however as you have pointed out, I had extracted the common
> code into a separate function.

I like it better the other way, so I've changed it back in the
attached version, which also works over the comments fairly heavily.

> I think we should not touch anything related to Gather (merge) as they
> don't need it for the purpose of correctness.  However, we might want
> to improve them by using this new API at a certain point if the need
> arises.  I guess we can use this API to detect failures early.

I added a comment in this version explaining why it works, so that we
don't forget (again).  If we decide to change it in the future then we
can remove or update the comment.

Another thing I did was known_started_workers ->
known_attached_workers, which I think is more precisely correct.

Please let me know your thoughts about this version.  If it looks OK,
I'll commit it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


wait-for-attach-rmh.patch
Description: Binary data


Re: Wait for parallel workers to attach

2018-01-31 Thread Amit Kapila
On Mon, Jan 29, 2018 at 4:01 AM, Peter Geoghegan  wrote:
> On Sat, Jan 27, 2018 at 12:14 AM, Amit Kapila  wrote:
>
> I also found that all of these errors were propagated. The patch helps
> parallel CREATE INDEX as expected/designed.
>

Great!

> Some small things that I noticed about the patch:
>
> * Maybe "if (!known_started_workers[i])" should be converted to "if
> (known_started_workers[i]) continue;", to decrease the indentation
> level of the WaitForParallelWorkersToAttach() loop.
>

Changed as per suggestion.

> * There might be some opportunity to share some of the new code with
> the code recently committed to WaitForParallelWorkersToFinish(). For
> one thing, the logic in this block could be refactored into a
> dedicated function that is called by both
> WaitForParallelWorkersToAttach() and WaitForParallelWorkersToFinish():
>
>> +   else if (status == BGWH_STOPPED)
>> +   {
>> +   /*
>> +* Check whether the worker ended up stopped without ever
>> +* attaching to the error queue.  If so, the postmaster
>> +* was unable to fork the worker or it exited without
>> +* initializing properly.  We must throw an error, since
>> +* the caller may have been expecting the worker to do
>> +* some work before exiting.
>> +*/
>> +   mq = shm_mq_get_queue(pcxt->worker[i].error_mqh);
>> +   if (shm_mq_get_sender(mq) == NULL)
>> +   ereport(ERROR,
>> +   
>> (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
>> +errmsg("parallel worker failed to 
>> initialize"),
>> +errhint("More details may be available in 
>> the server log.")));
>> +
>> +   known_started_workers[i] = true;
>> +   ++nknown_started_workers;
>> +   }
>

I had thought about this earlier but left it as the common code was
too less, however as you have pointed out, I had extracted the common
code into a separate function.

> * If we don't actually commit the patch to make nodeGather.c call
> WaitForParallelWorkersToAttach(), which I suspect will happen, then I
> think we should instead at least have a comment saying why it's okay
> that we don't call WaitForParallelWorkersToAttach(). If we go this
> way, the comment should directly replace the
> modify_gather_to_wait_for_attach_v1.patch call to
> WaitForParallelWorkersToAttach() -- this comment should go in
> ExecGather().
>
> * Maybe the comments at the top of WaitForParallelWorkersToAttach()
> should at least allude to the ExecGather() special case I just
> mentioned.
>

I think we should not touch anything related to Gather (merge) as they
don't need it for the purpose of correctness.  However, we might want
to improve them by using this new API at a certain point if the need
arises.  I guess we can use this API to detect failures early.

> * Maybe the comments at the top of WaitForParallelWorkersToAttach()
> should also advise callers that it's a good idea to try to do other
> leader-only work that doesn't involve a WaitLatch() before they call.
>
> In general, WaitForParallelWorkersToAttach() needs to be called before
> any WaitLatch() (or barrier wait, or condition variable wait) that
> waits on workers, and after workers are first launched, but callers
> may be able to arrange to do plenty of other work, just like nbtsort.c
> does. To be clear: IMHO calling WaitForParallelWorkersToAttach()
> should be the rule, not the exception.
>
> * Finally, perhaps the comments at the top of
> WaitForParallelWorkersToAttach() should also describe how it relates
> to WaitForParallelWorkersToFinish().
>
> ISTM that WaitForParallelWorkersToAttach() is a subset of
> WaitForParallelWorkersToFinish(), that does all that is needed to rely
> on nworkers_launched actually being the number of worker processes
> that are attached to the error queue. As such, caller can expect
> propagation of errors from workers using standard parallel message
> interrupts once WaitForParallelWorkersToAttach() returns. You probably
> shouldn't directly reference nworkers_launched, though, since that
> doesn't necessarily have to be involved for parallel client code to
> run into trouble. (You only need to wait on workers changing something
> in shared memory, and failing to actively inform leader of failure
> through a parallel message -- this might not involve testing
> nworkers_launched in the way parallel CREATE INDEX happens to.)
>

I have updated the comments atop WaitForParallelWorkersToAttach() to
address your above two points.

> known_started_workers looks a lot like any_message_received.  Perhaps 
> any_message_received should be renamed to known_started_workers and reused 
> here.  After all, if we know that a
> worker 

Re: Wait for parallel workers to attach

2018-01-30 Thread Amit Kapila
On Wed, Jan 31, 2018 at 8:46 AM, Robert Haas  wrote:
> On Tue, Jan 30, 2018 at 10:10 PM, Amit Kapila  wrote:
>> I am not getting what exactly you are suggesting here.  The wait loop
>> is intended for the case when some workers are not started.  We want
>> to wait for sometime before checking again whether workers are
>> started. I wanted to avoid busy looping waiting for some worker to
>> start.  I think in most cases we don't need to wait, but for some
>> corner cases where postmaster didn't get chance to start a worker, we
>> should avoid busy looping waiting for a worker to start.
>
> I agree we need to avoid busy-looping.  What I'm saying is that we
> don't need a timeout.  Why do you think we need a timeout?
>

I thought we need it for worker startup, but now after again looking
at the code, it seems we do notify at worker startup as well.   So, we
don't need a timeout.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: Wait for parallel workers to attach

2018-01-30 Thread Robert Haas
On Tue, Jan 30, 2018 at 10:10 PM, Amit Kapila  wrote:
>> known_started_workers looks a lot like any_message_received.  Perhaps
>> any_message_received should be renamed to known_started_workers and
>> reused here.
>
> Sure, that sounds good to me.  Do you prefer a separate patch for
> renaming any_message_received to known_started_workers or is it okay
> to have it along with the main patch?

A single patch sounds OK.

>>  After all, if we know that a worker was started, there's
>> no need for WaitForParallelWorkersToFinish to again call
>> GetBackgroundWorkerPid() for it.
>
> I think in above sentence you wanted to say
> WaitForParallelWorkersToAttach, not WaitForParallelWorkersToFinish.
> Am I right?

Yes.

>> I think that you shouldn't need the 10ms delay loop; waiting forever
>> should work.  If a work fails to start, the postmaster should send
>> SIGUSR1 which should set our latch.
>>
> I am not getting what exactly you are suggesting here.  The wait loop
> is intended for the case when some workers are not started.  We want
> to wait for sometime before checking again whether workers are
> started. I wanted to avoid busy looping waiting for some worker to
> start.  I think in most cases we don't need to wait, but for some
> corner cases where postmaster didn't get chance to start a worker, we
> should avoid busy looping waiting for a worker to start.

I agree we need to avoid busy-looping.  What I'm saying is that we
don't need a timeout.  Why do you think we need a timeout?  We have
bgw_notify_pid so that we will get a signal instead of having to poll.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Wait for parallel workers to attach

2018-01-30 Thread Amit Kapila
On Mon, Jan 29, 2018 at 8:25 PM, Robert Haas  wrote:
> On Sat, Jan 27, 2018 at 3:14 AM, Amit Kapila  wrote:
>> During the recent development of parallel operation (parallel create
>> index)[1], a need has been arised for $SUBJECT.  The idea is to allow
>> leader backend to rely on number of workers that are successfully
>> started.  This API allows leader to wait for all the workers to start
>> or fail even if one of the workers fails to attach.  We consider
>> workers started/attached once they are attached to error queue.  This
>> will ensure that any error after the workers are attached won't be
>> silently ignored by leader.
>
> known_started_workers looks a lot like any_message_received.  Perhaps
> any_message_received should be renamed to known_started_workers and
> reused here.
>

Sure, that sounds good to me.  Do you prefer a separate patch for
renaming any_message_received to known_started_workers or is it okay
to have it along with the main patch?

>  After all, if we know that a worker was started, there's
> no need for WaitForParallelWorkersToFinish to again call
> GetBackgroundWorkerPid() for it.
>

I think in above sentence you wanted to say
WaitForParallelWorkersToAttach, not WaitForParallelWorkersToFinish.
Am I right?

> I think that you shouldn't need the 10ms delay loop; waiting forever
> should work.  If a work fails to start, the postmaster should send
> SIGUSR1 which should set our latch.
>

I am not getting what exactly you are suggesting here.  The wait loop
is intended for the case when some workers are not started.  We want
to wait for sometime before checking again whether workers are
started. I wanted to avoid busy looping waiting for some worker to
start.  I think in most cases we don't need to wait, but for some
corner cases where postmaster didn't get chance to start a worker, we
should avoid busy looping waiting for a worker to start.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: Wait for parallel workers to attach

2018-01-29 Thread Robert Haas
On Sat, Jan 27, 2018 at 3:14 AM, Amit Kapila  wrote:
> During the recent development of parallel operation (parallel create
> index)[1], a need has been arised for $SUBJECT.  The idea is to allow
> leader backend to rely on number of workers that are successfully
> started.  This API allows leader to wait for all the workers to start
> or fail even if one of the workers fails to attach.  We consider
> workers started/attached once they are attached to error queue.  This
> will ensure that any error after the workers are attached won't be
> silently ignored by leader.

known_started_workers looks a lot like any_message_received.  Perhaps
any_message_received should be renamed to known_started_workers and
reused here.  After all, if we know that a worker was started, there's
no need for WaitForParallelWorkersToFinish to again call
GetBackgroundWorkerPid() for it.

I think that you shouldn't need the 10ms delay loop; waiting forever
should work.  If a work fails to start, the postmaster should send
SIGUSR1 which should set our latch.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Wait for parallel workers to attach

2018-01-27 Thread Amit Kapila
During the recent development of parallel operation (parallel create
index)[1], a need has been arised for $SUBJECT.  The idea is to allow
leader backend to rely on number of workers that are successfully
started.  This API allows leader to wait for all the workers to start
or fail even if one of the workers fails to attach.  We consider
workers started/attached once they are attached to error queue.  This
will ensure that any error after the workers are attached won't be
silently ignored by leader.

I have used wait event as WAIT_EVENT_BGWORKER_STARTUP similar to
WaitForReplicationWorkerAttach, but we might want to change it.

I have tested this patch by calling this API in nodeGather.c and then
introducing failuires at various places: (a) induce fork failure for
background workers (force_fork_failure_v1.patch), (b) Exit parallel
worker before attaching to the error queue
(exit_parallel_worker_before_error_queue_attach_v1.patch) and (c) Exit
parallel worker after attaching to the error queue
(exit_parallel_worker_after_error_queue_attach_v1.patch).

In all above cases, I got the errors as expected.

[1] - 
https://www.postgresql.org/message-id/CAA4eK1KgmdT3ivm1vG%2BrJzKOKeYQU2XLhp6ma5LMHxaG89brsA%40mail.gmail.com

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


WaitForParallelWorkersToAttach_v1.patch
Description: Binary data


modify_gather_to_wait_for_attach_v1.patch
Description: Binary data


force_fork_failure_v1.patch
Description: Binary data


exit_parallel_worker_before_error_queue_attach_v1.patch
Description: Binary data


exit_parallel_worker_after_error_queue_attach_v1.patch
Description: Binary data