On Mon, Mar 6, 2017 at 8:15 AM, Dilip Kumar <dilipbal...@gmail.com> wrote: > On Fri, Mar 3, 2017 at 3:57 PM, Robert Haas <robertmh...@gmail.com> wrote: >> I'm not happy with the way this patch can just happen to latch on to a >> path that's not parallel-safe rather than one that is and then just >> give up on a merge join in that case. I already made this argument in >> https://www.postgresql.org/message-id/ca+tgmobdw2au1jq5l4ysa2zhqfma-qnvd7zfazbjwm3c0ys...@mail.gmail.com >> and my opinion hasn't changed. I've subsequently come to realize that >> this problem is more widespread that I initially realized: not only do >> sort_inner_and_outer() and generate_partial_mergejoin_paths() give up >> without searching for the cheapest parallel-safe path, but the latter >> later calls get_cheapest_path_for_pathkeys() and then just pretends it >> didn't find anything unless the result is parallel-safe. This makes >> no sense to me: the cheapest parallel-safe path could be 1% more >> expensive than the cheapest path of any kind, but because it's not the >> cheapest we arbitrarily skip it, even though parallelizing more of the >> tree might leave us way ahead overall. > > for sort_inner_and_outer I have followed the same logic what > hash_inner_and_outer is doing. Also moved the logic of selecting the > cheapest_safe_inner out of the pathkey loop.
+ /* Generate partial path if inner is parallel safe. */ + if (inner_cheapest_total->parallel_safe) + try_partial_mergejoin_path(root, + joinrel, + outerpath, + inner_cheapest_total, + merge_pathkeys, + mergeclauses, + NIL, + innersortkeys, + jointype, + extra); + + /* Can't do anything else if inner path needs to be unique'd */ + if (save_jointype == JOIN_UNIQUE_INNER) + return; Right after this, you should try_partial_mergejoin_path() with the result of get_cheapest_parallel_safe_total_inner(), just like sort_inner_and_outer() already does. I think with some work you could merge generate_mergejoin_paths with generate_partial_mergejoin_paths instead of duplicating all the code. They're not really that different, and you could reduce the differences further if you made try_mergejoin_path() take an additional is_partial argument and pass the call to try_partial_mergejoin_path() if it's true. The various bits of special handling related to join types that aren't supported in parallel queries aren't going to help you in parallel mode, but they won't hurt either. This bit: + /* Generate partial path if inner is parallel safe. */ + if (inner_cheapest_total->parallel_safe) + try_partial_mergejoin_path(root, ...is really not right. That value is being passed down from consider_parallel_mergejoin(), which is getting it from match_unsorted_outer(), which really shouldn't be calling this code in the first place with a path that's not parallel-safe. What it ought to do is (a) pass inner_cheapest_total if that's non-NULL and parallel-safe, or else (b) call get_cheapest_parallel_safe_total_inner() and pass that value, (c) unless that's NULL also in which case it should skip the call altogether. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (email@example.com) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers