Robert Haas <robertmh...@gmail.com> writes:
> On Thu, Jun 15, 2017 at 10:05 AM, Tom Lane <t...@sss.pgh.pa.us> wrote:
>> But we know, from the subsequent failed assertion, that the leader was
>> still trying to launch parallel workers. So that particular theory
>> doesn't hold water.
> Is there any chance that it's already trying to launch parallel
> workers for the *next* query?

Oh! Yeah, you might be right, because the trace includes a statement
LOG entry from the leader in between:

2017-06-13 16:44:57.179 EDT [59404ec6.2758:63] LOG: statement: EXPLAIN (analyze, timing off, summary off, costs off) SELECT * FROM tenk1;
2017-06-13 16:44:57.247 EDT [59404ec9.2e78:1] ERROR: could not map dynamic shared memory segment
2017-06-13 16:44:57.248 EDT [59404dec.2d9c:5] LOG: worker process: parallel worker for PID 10072 (PID 11896) exited with exit code 1
2017-06-13 16:44:57.273 EDT [59404ec6.2758:64] LOG: statement: select stringu1::int2 from tenk1 where unique1 = 1;
TRAP: FailedAssertion("!(BackgroundWorkerData->parallel_register_count - BackgroundWorkerData->parallel_terminate_count <= 1024)", File: "/home/andrew/bf64/root/HEAD/pgsql.build/../pgsql/src/backend/postmaster/bgworker.c", Line: 974)
2017-06-13 16:45:02.652 EDT [59404dec.2d9c:6] LOG: server process (PID 10072) was terminated by signal 6: Aborted

It's fairly hard to read this other than as telling us that the worker
was launched for the EXPLAIN (although really? why aren't we skipping
that if EXEC_FLAG_EXPLAIN_ONLY?), and then we see a new LOG entry for
the next statement before the leader hits its assertion failure.

> Could be -- but it could also be timing-related. If we are in fact
> using cygwin's fork emulation, the documentation for it explains that
> it's slow: https://www.cygwin.com/faq.html#faq.api.fork
> Interestingly, it also mentions that making it work requires
> suspending the parent while the child is starting up, which probably
> does not happen on any other platform. Of course it also makes my
> theory that the child doesn't reach dsm_attach() before the parent
> finishes the query pretty unlikely.

Well, if this was a worker launched during InitPlan() for an EXPLAIN,
the leader would have shut down the query almost immediately after
launching the worker.
So it does fit pretty well as long as you're willing to believe
that the leader got to run before the child.

But what this theory doesn't explain is: why haven't we seen this
before? It now seems like it ought to come up often, since there are
several EXPLAINs for parallel queries in that test.

			regards, tom lane