On Mon, Mar 6, 2017 at 8:15 AM, Dilip Kumar <dilipbal...@gmail.com> wrote:
> On Fri, Mar 3, 2017 at 3:57 PM, Robert Haas <robertmh...@gmail.com> wrote:
>> I'm not happy with the way this patch can just happen to latch on to a
>> path that's not parallel-safe rather than one that is and then just
>> give up on a merge join in that case.  I already made this argument in
>> https://www.postgresql.org/message-id/ca+tgmobdw2au1jq5l4ysa2zhqfma-qnvd7zfazbjwm3c0ys...@mail.gmail.com
>> and my opinion hasn't changed.  I've subsequently come to realize that
>> this problem is more widespread that I initially realized: not only do
>> sort_inner_and_outer() and generate_partial_mergejoin_paths() give up
>> without searching for the cheapest parallel-safe path, but the latter
>> later calls get_cheapest_path_for_pathkeys() and then just pretends it
>> didn't find anything unless the result is parallel-safe.  This makes
>> no sense to me: the cheapest parallel-safe path could be 1% more
>> expensive than the cheapest path of any kind, but because it's not the
>> cheapest we arbitrarily skip it, even though parallelizing more of the
>> tree might leave us way ahead overall.
> for sort_inner_and_outer I have followed the same logic what
> hash_inner_and_outer is doing.  Also moved the logic of selecting the
> cheapest_safe_inner out of the pathkey loop.

+    /* Generate partial path if inner is parallel safe. */
+    if (inner_cheapest_total->parallel_safe)
+        try_partial_mergejoin_path(root,
+                                   joinrel,
+                                   outerpath,
+                                   inner_cheapest_total,
+                                   merge_pathkeys,
+                                   mergeclauses,
+                                   NIL,
+                                   innersortkeys,
+                                   jointype,
+                                   extra);
+    /* Can't do anything else if inner path needs to be unique'd */
+    if (save_jointype == JOIN_UNIQUE_INNER)
+        return;

Right after this, you should try_partial_mergejoin_path() with the
result of get_cheapest_parallel_safe_total_inner(), just like
sort_inner_and_outer() already does.

I think with some work you could merge generate_mergejoin_paths with
generate_partial_mergejoin_paths instead of duplicating all the code.
They're not really that different, and you could reduce the
differences further if you made try_mergejoin_path() take an
additional is_partial argument and pass the call to
try_partial_mergejoin_path() if it's true.  The various bits of
special handling related to join types that aren't supported in
parallel queries aren't going to help you in parallel mode, but they
won't hurt either.  This bit:

+    /* Generate partial path if inner is parallel safe. */
+    if (inner_cheapest_total->parallel_safe)
+        try_partial_mergejoin_path(root,

...is really not right.  That value is being passed down from
consider_parallel_mergejoin(), which is getting it from
match_unsorted_outer(), which really shouldn't be calling this code in
the first place with a path that's not parallel-safe.  What it ought
to do is (a) pass inner_cheapest_total if that's non-NULL and
parallel-safe, or else (b) call
get_cheapest_parallel_safe_total_inner() and pass that value, (c)
unless that's NULL also in which case it should skip the call

Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:

Reply via email to