I wrote:
> I've not tried to trace the code, but I'm now a bit suspicious
> that there is indeed a design bug here. I gather from the
> comments that parallel_register_count is incremented by the
> worker processes, which of course implies that a worker that
> fails to reattach to shared memory won't do that. But
> parallel_terminate_count is incremented by the postmaster.
> If the postmaster will do that even in the case of a worker that
> failed at startup, then lorikeet's symptoms are neatly explained.
That theory seems to be nonsense. After a bit more study of the
code, I see that parallel_register_count is incremented by the *leader*
process, when it reserves a BackgroundWorkerSlot for the worker.
And parallel_terminate_count is incremented by the postmaster when
it releases the slot; so it's darn hard to see how
parallel_terminate_count could get ahead of parallel_register_count.
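To spell out the invariant (this is a paraphrase, not the exact
bgworker.c code): the two counters are meant to pair up like this,
so their difference is the number of parallel-worker slots still
reserved and can never go negative:

    /* Leader, in RegisterDynamicBackgroundWorker(), after winning a slot: */
    if ((worker->bgw_flags & BGWORKER_CLASS_PARALLEL) != 0)
        BackgroundWorkerData->parallel_register_count++;

    /* Postmaster, when it later releases that same slot: */
    if ((slot->worker.bgw_flags & BGWORKER_CLASS_PARALLEL) != 0)
        BackgroundWorkerData->parallel_terminate_count++;

    /*
     * Hence parallel_register_count - parallel_terminate_count is the
     * number of parallel slots currently reserved, which is what the
     * failing assertion expects to be >= 0.
     */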
I noticed, by the way, that lorikeet's worker didn't fail at
shared-memory reattach as I first thought.  It failed at
    ERROR: could not map dynamic shared memory segment
which means we ought to be able to reproduce the symptoms by faking
failure of dsm_attach(), as I did in the quick hack attached.
What I get is a lot of "parallel worker failed to initialize" and
"lost connection to parallel worker" errors, but no assertion.
(I also tried this with an EXEC_BACKEND build, just in case that'd
change the behavior, but it didn't.) So it seems like the "lorikeet
is flaky" theory is looking pretty plausible.
I do see what seems to be a bug-let in ForgetBackgroundWorker.
BackgroundWorkerStateChange is careful to do this when freeing
a slot:
    /*
     * We need a memory barrier here to make sure that the load of
     * bgw_notify_pid and the update of parallel_terminate_count
     * complete before the store to in_use.
     */
    notify_pid = slot->worker.bgw_notify_pid;
    if ((slot->worker.bgw_flags & BGWORKER_CLASS_PARALLEL) != 0)
        BackgroundWorkerData->parallel_terminate_count++;
    pg_memory_barrier();
    slot->pid = 0;
    slot->in_use = false;
but the mainline case in ForgetBackgroundWorker is a lot less
paranoid:
    Assert(rw->rw_shmem_slot < max_worker_processes);
    slot = &BackgroundWorkerData->slot[rw->rw_shmem_slot];
    if ((rw->rw_worker.bgw_flags & BGWORKER_CLASS_PARALLEL) != 0)
        BackgroundWorkerData->parallel_terminate_count++;
    slot->in_use = false;
One of these functions is mistaken.  However, I can't construct
a theory whereby that explains lorikeet's symptoms, mainly because
Intel chips don't reorder stores with respect to each other: the
update of parallel_terminate_count should become visible before
in_use is cleared, even without an explicit memory barrier.
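For the record, if it's ForgetBackgroundWorker that's wrong, I'd
expect the fix to just mirror what BackgroundWorkerStateChange
already does (untested sketch):

    Assert(rw->rw_shmem_slot < max_worker_processes);
    slot = &BackgroundWorkerData->slot[rw->rw_shmem_slot];
    if ((rw->rw_worker.bgw_flags & BGWORKER_CLASS_PARALLEL) != 0)
        BackgroundWorkerData->parallel_terminate_count++;

    /*
     * Ensure the counter update is visible to other backends before the
     * slot appears free; otherwise, on a weakly-ordered machine, a leader
     * could observe in_use == false and grab the slot before seeing the
     * terminate_count increment.
     */
    pg_memory_barrier();
    slot->in_use = false;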
regards, tom lane
diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c
index 3550ef13ba..d3b00c2f9e 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -1278,6 +1278,11 @@ ParallelWorkerMain(Datum main_arg)
"Parallel worker",
ALLOCSET_DEFAULT_SIZES);
+ if (random() < INT_MAX / 64)
+ ereport(ERROR,
+ (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("fake failure to map dynamic shared memory segment")));
+
/*
* Attach to the dynamic shared memory segment for the parallel query, and
* find its table of contents.