During the recent development of parallel operations (parallel create
index) [1], a need has arisen for $SUBJECT.  The idea is to let the
leader backend rely on the number of workers that have started
successfully.  This API lets the leader wait until every worker has
either started or failed, even if one of the workers fails to attach.
We consider a worker started/attached once it has attached to the error
queue.  This ensures that any error raised after the workers are
attached won't be silently ignored by the leader.
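
To make the waiting behaviour described above concrete, here is a small
standalone sketch of the loop.  It is not the code from the attached
patch: WorkerState, AttachResult, PollFn, wait_for_workers_to_attach and
demo_poll are all hypothetical names invented for illustration, and the
poll callback merely stands in for whatever the leader really does to
sleep and re-check worker status.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical worker states, for illustration only. */
typedef enum
{
    WORKER_NOT_YET_STARTED, /* registered, no process yet */
    WORKER_STARTED,         /* process running, not yet attached */
    WORKER_ATTACHED,        /* attached to the error queue */
    WORKER_STOPPED          /* fork failed, or exited before attaching */
} WorkerState;

typedef struct
{
    int attached;           /* workers attached to the error queue */
    int failed;             /* workers that will never attach */
} AttachResult;

/* Stand-in for "sleep, then refresh worker status" (an assumption). */
typedef void (*PollFn) (WorkerState *states, size_t n, void *arg);

/*
 * Loop until every worker is either attached or known to have failed,
 * then report how many of each there are.  One failing worker does not
 * end the wait early; the remaining workers are still waited for.
 */
static AttachResult
wait_for_workers_to_attach(WorkerState *states, size_t n,
                           PollFn poll, void *arg)
{
    assert(poll != NULL);

    for (;;)
    {
        AttachResult r = {0, 0};

        for (size_t i = 0; i < n; i++)
        {
            if (states[i] == WORKER_ATTACHED)
                r.attached++;
            else if (states[i] == WORKER_STOPPED)
                r.failed++;
        }

        /* Done once no worker is still in an intermediate state. */
        if ((size_t) (r.attached + r.failed) == n)
            return r;

        poll(states, n, arg);
    }
}

/*
 * Demo poll: every call moves each worker one step forward, except that
 * the worker whose index is passed in arg fails to start.
 */
static void
demo_poll(WorkerState *states, size_t n, void *arg)
{
    size_t failing = *(size_t *) arg;

    for (size_t i = 0; i < n; i++)
    {
        if (states[i] == WORKER_NOT_YET_STARTED)
            states[i] = (i == failing) ? WORKER_STOPPED : WORKER_STARTED;
        else if (states[i] == WORKER_STARTED)
            states[i] = WORKER_ATTACHED;
    }
}
```

With three workers and demo_poll told to fail worker 1, the loop returns
only after the other two have attached, so the caller sees an exact
count of usable workers rather than a guess.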

I have used WAIT_EVENT_BGWORKER_STARTUP as the wait event, similar to
WaitForReplicationWorkerAttach, but we might want to change it.

I have tested this patch by calling this API in nodeGather.c and then
introducing failures at various places: (a) induce a fork failure for
background workers (force_fork_failure_v1.patch), (b) exit the parallel
worker before it attaches to the error queue
(exit_parallel_worker_before_error_queue_attach_v1.patch), and (c) exit
the parallel worker after it attaches to the error queue
(exit_parallel_worker_after_error_queue_attach_v1.patch).

In all of the above cases, I got the expected errors.

[1] - 
https://www.postgresql.org/message-id/CAA4eK1KgmdT3ivm1vG%2BrJzKOKeYQU2XLhp6ma5LMHxaG89brsA%40mail.gmail.com

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachment: WaitForParallelWorkersToAttach_v1.patch
Description: Binary data

Attachment: modify_gather_to_wait_for_attach_v1.patch
Description: Binary data

Attachment: force_fork_failure_v1.patch
Description: Binary data

Attachment: exit_parallel_worker_before_error_queue_attach_v1.patch
Description: Binary data

Attachment: exit_parallel_worker_after_error_queue_attach_v1.patch
Description: Binary data
