On Fri, Mar 3, 2017 at 3:57 PM, Robert Haas <robertmh...@gmail.com> wrote:
> I'm not happy with the way this patch can just happen to latch on to a
> path that's not parallel-safe rather than one that is and then just
> give up on a merge join in that case.  I already made this argument in
> https://www.postgresql.org/message-id/ca+tgmobdw2au1jq5l4ysa2zhqfma-qnvd7zfazbjwm3c0ys...@mail.gmail.com
> and my opinion hasn't changed.  I've subsequently come to realize that
> this problem is more widespread that I initially realized: not only do
> sort_inner_and_outer() and generate_partial_mergejoin_paths() give up
> without searching for the cheapest parallel-safe path, but the latter
> later calls get_cheapest_path_for_pathkeys() and then just pretends it
> didn't find anything unless the result is parallel-safe.  This makes
> no sense to me: the cheapest parallel-safe path could be 1% more
> expensive than the cheapest path of any kind, but because it's not the
> cheapest we arbitrarily skip it, even though parallelizing more of the
> tree might leave us way ahead overall.

for sort_inner_and_outer I have followed the same logic what
hash_inner_and_outer is doing.  Also moved the logic of selecting the
cheapest_safe_inner out of the pathkey loop.

>
> I suggest that we add an additional "bool require_parallel_safe"
> argument to get_cheapest_path_for_pathkeys(); when false, the function
> will behave as at present, but when true, it will skip paths that are
> not parallel-safe.  And similarly have a function to find the cheapest
> parallel-safe path if the first one in the list isn't it.  See
> attached draft patch.  Then this code can search for the correct path
> instead of searching using the wrong criteria and then giving up if it
> doesn't get the result it wants.

Done as suggested.  Also, rebased the path_search_for_mergejoin on
head and updated the comments of get_cheapest_path_for_pathkeys for
new argument.

>
> +    if (!(joinrel->consider_parallel &&
> +          save_jointype != JOIN_UNIQUE_OUTER &&
> +          save_jointype != JOIN_FULL &&
> +          save_jointype != JOIN_RIGHT &&
> +          outerrel->partial_pathlist != NIL &&
> +          bms_is_empty(joinrel->lateral_relids)))
>
> I would distribute the negation, so that this reads if
> (!joinrel->consider_parallel || save_jointype == JOIN_UNIQUE_OUTER ||
> save_jointype == JOIN_FULL || ...).  Or maybe better yet, just drop
> the ! and the return that follows and put the
> consider_parallel_nestloop and consider_parallel_mergejoin calls
> inside the if-block.

Done
>
> You could Assert(nestjoinOK) instead of testing it, although I guess
> it doesn't really matter.

left as it is.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com

Attachment: parallel_mergejoin_v9.patch
Description: Binary data

Attachment: path-search-for-mergejoin-v2.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Reply via email to