On Tue, Jun 6, 2017 at 2:21 PM, Tom Lane <t...@sss.pgh.pa.us> wrote:
>> One thought is that the only places where shm_mq_set_sender() should
>> be getting invoked during the main regression tests are
>> ParallelWorkerMain() and ExecParallelGetReceiver, and both of those
>> places use ParallelWorkerNumber to figure out what address to pass.
>> So if ParallelWorkerNumber were getting set to the same value in two
>> different parallel workers - e.g. because the postmaster went nuts and
>> launched two processes instead of only one - or if
>> ParallelWorkerNumber were not getting initialized at all or were
>> getting initialized to some completely bogus value, it could cause
>> this symptom.
>
> Hmm.  With some generous assumptions it'd be possible to think that
> aa1351f1eec4adae39be59ce9a21410f9dd42118 triggered this.  That commit was
> present in 20 successful lorikeet runs before the first of these failures,
> which is a bit more than the MTBF after that, but not a huge amount more.
>
> That commit in itself looks innocent enough, but could it have exposed
> some latent bug in bgworker launching?

Hmm, that's a really interesting idea, but I can't quite put together
a plausible theory around it.  I mean, it seems like that commit could
make launching bgworkers faster, which could conceivably tickle some
heretofore-latent timing-related bug.  But it wouldn't, IIUC, make the
first worker start any faster than before - it would just make later
launches more closely spaced, and it's not very obvious how that
would cause a problem.

Another idea is that the commit in question is managing to corrupt
BackgroundWorkerList somehow.  maybe_start_bgworkers() is using
slist_foreach_modify(), but previously it always returned after
calling do_start_bgworker(), and now it doesn't.  So if
do_start_bgworker() did something that could modify the list
structure, then perhaps maybe_start_bgworkers() would get confused.  I
don't really think that this theory has any legs, though.
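
To make that concern concrete, here is a toy sketch of the hazard I have
in mind.  It is not PostgreSQL's ilist.h and not the real postmaster
code; Node, unlink_after() and walk() are hypothetical names invented
for illustration.  The assumption is only that a modify-safe iterator
caches the successor before running the loop body, so removing the
current element is fine but a callee that edits the list elsewhere can
leave the iterator holding a stale pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Toy singly linked list node; a stand-in, not an ilist.h type. */
typedef struct Node
{
    int          id;
    struct Node *next;
} Node;

/*
 * Unlink the node that follows 'n', if there is one.  The node is not
 * freed, so the demonstration stays well-defined.
 */
static void
unlink_after(Node *n)
{
    if (n->next != NULL)
        n->next = n->next->next;
}

/*
 * Walk the list while caching the successor before the body runs.
 * If 'mutate' is nonzero, the body for node 1 unlinks the node after it,
 * playing the role of a callee that changes the list behind the
 * iterator's back.  Records each visited id and returns the visit count.
 */
static int
walk(Node *head, int mutate, int *visited, int cap)
{
    int   n = 0;
    Node *cur;
    Node *next;

    for (cur = head; cur != NULL; cur = next)
    {
        next = cur->next;       /* cached before the body runs */

        assert(n < cap);
        visited[n++] = cur->id;

        if (mutate && cur->id == 1)
            unlink_after(cur);  /* list is now 1 -> 3, but 'next' is node 2 */
    }
    return n;
}
```

Calling walk() with mutate set on the list 1 -> 2 -> 3 still visits node
2, even though it was unlinked while node 1 was being processed.  If the
callee had freed that node instead of merely unlinking it, the loop
would be touching freed memory.  Whether do_start_bgworker() can
actually do anything like this is exactly the part I doubt.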

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)