On Tue, Mar 7, 2017 at 5:21 AM, Robert Haas <robertmh...@gmail.com> wrote: > + /* Can't do anything else if inner path needs to be unique'd */ > + if (save_jointype == JOIN_UNIQUE_INNER) > + return; > > Right after this, you should try_partial_mergejoin_path() with the > result of get_cheapest_parallel_safe_total_inner(), just like > sort_inner_and_outer() already does. > > I think with some work you could merge generate_mergejoin_paths with > generate_partial_mergejoin_paths instead of duplicating all the code. > They're not really that different, and you could reduce the > differences further if you made try_mergejoin_path() take an > additional is_partial argument and pass the call to > try_partial_mergejoin_path() if it's true. The various bits of > special handling related to join types that aren't supported in > parallel queries aren't going to help you in parallel mode, but they > won't hurt either. This bit: > > + /* Generate partial path if inner is parallel safe. */ > + if (inner_cheapest_total->parallel_safe) > + try_partial_mergejoin_path(root, > > ...is really not right. That value is being passed down from > consider_parallel_mergejoin(), which is getting it from > match_unsorted_outer(), which really shouldn't be calling this code in > the first place with a path that's not parallel-safe. What it ought > to do is (a) pass inner_cheapest_total if that's non-NULL and > parallel-safe, or else (b) call > get_cheapest_parallel_safe_total_inner() and pass that value, (c) > unless that's NULL also in which case it should skip the call > altogether.
I have tried to address above comments in the latest patch, But there is one part of the patch which I am not quite sure yet. So still this patch is not a final one. + * + * If inner_cheapest_total is not NULL or not a parallel safe path then + * find the cheapest total parallel safe path. + */ + if (inner_cheapest_total == NULL || + inner_cheapest_total->parallel_safe == false) + { + inner_cheapest_total = + get_cheapest_parallel_safe_total_inner(innerrel->pathlist); + } + + if (inner_cheapest_total) + consider_parallel_mergejoin(root, joinrel, outerrel, innerrel, + save_jointype, extra, + inner_cheapest_total); I am confused about whether to call "get_cheapest_parallel_safe_total_inner" with innerrel->cheapest_parameterized_paths like we do in case of hash_inner_and_outer or with innerrel->pathlist. The reason behind I am calling this with complete pathlist (innerrel->pathlist) is that inside generate_mergejoin_paths it calls "get_cheapest_path_for_pathkeys" for complete pathlist and if we decide not to call consider_parallel_mergejoin if the cheapest parallel safe path in innerrel->cheapest_parameterized_paths is NULL then it will not be fair (we might have some parallel safe paths in innerrel->pathlist). -- Regards, Dilip Kumar EnterpriseDB: http://www.enterprisedb.com
parallel_mergejoin_v10.patch
Description: Binary data
-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers