Re: pgsql: Add parallel-aware hash joins.

Thomas Munro Thu, 04 Jan 2018 14:15:58 -0800

On Fri, Jan 5, 2018 at 5:00 AM, Tom Lane <[email protected]> wrote:
> The early returns indicate that that problem is fixed;


Thanks for your help and patience with that.  I've made a list over
here so we don't lose track of the various things that should be
improved in this area, and will start a new thread when I have patches
to propose:  https://wiki.postgresql.org/wiki/Parallel_Hash

> but now that the
> noise level is down, it's possible to see that brolga is showing an actual
> crash in the PHJ test, perhaps one time in four.  So we're not out of
> the woods yet.  It seems to consistently look like this:
>
> 2017-12-21 17:34:52.092 EST [2252:4] LOG:  background worker "parallel worker" (PID 3584) was terminated by signal 11
> 2017-12-21 17:34:52.092 EST [2252:5] DETAIL:  Failed process was running: select count(*) from foo
>           left join (select b1.id, b1.t from bar b1 join bar b2 using (id)) ss
>           on foo.id < ss.id + 1 and foo.id > ss.id - 1;
> 2017-12-21 17:34:52.092 EST [2252:6] LOG:  terminating any other active server processes

That is a test of a parallel-aware hash join with a rescan (i.e., workers
get restarted repeatedly by the gather node reusing the DSM; maybe I
misunderstood some detail of the protocol for that).  I'll go and
review that code and try to reproduce the failure.  Andrew, on the
off-chance: do you have a core dump on brolga that you could pull a
backtrace out of?

-- 
Thomas Munro
http://www.enterprisedb.com
