On Mon, Mar 7, 2016 at 3:37 PM, Robert Haas <robertmh...@gmail.com> wrote:
> The currently-committed code generates paths where nested loops and
> hash joins get pushed beneath the Gather node, but does not generate
> paths where merge joins have been pushed beneath the Gather node.  And
> the reason I didn't try to generate those paths is because I believe
> they will almost always suck.  As of now, what we know how to do is
> build a partial path for a join by joining a partial path for the
> outer input rel against an ordinary path for the inner rel.  That
> means that the work of generating the inner rel has to be redone in
> each worker.  That's not a problem if we've got something like a
> nested loop with a parameterized inner index scan, because that sort
> of plan redoes all the work for every row anyway.  It is a problem for
> a hash join, but it's not too hard for it to be worthwhile anyway if
> the build table is small.  For a merge join, though, it seems rather
> unpromising.  It's really doubtful that we want each worker to
> independently sort the inner rel and then have them join their own
> subset of the outer rel against their own copy of the sort.  *Maybe*
> it could win if the inner path is an index scan, but I wasn't really
> sure that would come up and be a win often enough to be worth the cost
> of generating the path.  We tend to only use merge joins when both of
> the relations involved are large, and index-scanning a large relation
> tends to lose to sorting it.  So it just seemed like a dead end.

This is the first message on this subthread that actually gave me a
feeling I understood the issue under discussion. It explains the
distinction between plans that are parallel-safe and plans that would
actually do something different under a parallel worker.
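
To make the quoted argument concrete, here is a toy sketch of the
trade-off it describes: the outer rel is split across workers, while the
inner rel is produced in full by every worker. This is not PostgreSQL's
cost model. The function names, the cost units, and the example numbers
are all assumptions made up for illustration, and Gather overhead is
ignored.

```python
# Toy model only: names and numbers are hypothetical, not planner code.

def serial_cost(outer_cost, inner_cost):
    """Cost of a non-parallel join: scan the outer once, build the inner once."""
    return outer_cost + inner_cost


def per_worker_cost(outer_cost, inner_cost, workers):
    """Elapsed cost per worker: a 1/workers share of the outer rel,
    plus the full inner rel, which each worker produces on its own."""
    return outer_cost / workers + inner_cost


def total_work(outer_cost, inner_cost, workers):
    """Work summed over all workers: the inner rel is paid for once per worker."""
    return outer_cost + workers * inner_cost


if __name__ == "__main__":
    workers = 4
    cases = [
        # Hash join with a small build table: the repeated inner work is cheap.
        ("small inner", 1000, 10),
        # Merge join where both sides are large: every worker repeats a big sort.
        ("large inner", 1000, 1000),
    ]
    for label, outer, inner in cases:
        serial = serial_cost(outer, inner)
        elapsed = per_worker_cost(outer, inner, workers)
        work = total_work(outer, inner, workers)
        print(f"{label}: serial={serial} per_worker={elapsed} "
              f"total_work={work} speedup={serial / elapsed:.2f}")
```

Under these made-up numbers the small-inner case gets close to a
4x speedup for almost no extra work, while the large-inner case gets
well under 2x and does 2.5 times the total work of the serial plan.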


Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)