[ https://issues.apache.org/jira/browse/HIVE-14731?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16220885#comment-16220885 ]
Zhiyuan Yang commented on HIVE-14731:
-------------------------------------
[~kgyrtkirk] The main reason is that mapjoin decides parallelism according to
the number of splits, which is not good enough for a cross product. The cost
of a cross product is mostly determined by the number of records rather than
the number of splits.
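
To illustrate the point, here is a rough sketch (not Hive or Tez code) that
compares a split-driven task count with a record-driven one. The function
names, the one-task-per-split rule, and the PAIRS_PER_TASK budget are all
assumptions made up for this example.

```python
def tasks_from_splits(num_splits):
    """Split-driven heuristic (assumed): one task per input split."""
    return num_splits


def tasks_from_records(left_records, right_records, pairs_per_task):
    """Record-driven heuristic (assumed): size parallelism by output pairs.

    A cross product emits left_records * right_records pairs, so the work
    grows with the record counts no matter how the inputs were split.
    """
    total_pairs = left_records * right_records
    # Integer ceiling division, with a floor of one task.
    return max(1, (total_pairs + pairs_per_task - 1) // pairs_per_task)


# Hypothetical budget of output pairs that a single task should handle.
PAIRS_PER_TASK = 10_000_000

# Two hypothetical queries with the same split count but different sizes.
queries = {
    "small": {"splits": 4, "left": 1_000, "right": 1_000},
    "large": {"splits": 4, "left": 1_000_000, "right": 1_000},
}

for name, q in queries.items():
    by_splits = tasks_from_splits(q["splits"])
    by_records = tasks_from_records(q["left"], q["right"], PAIRS_PER_TASK)
    print(name, "by splits:", by_splits, "by records:", by_records)
```

Both queries get the same task count from the split-based rule even though
the second one produces 1000 times as many output pairs, which is why the
record count looks like the better signal for a cross product.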
> Use Tez cartesian product edge in Hive (unpartitioned case only)
> ----------------------------------------------------------------
>
> Key: HIVE-14731
> URL: https://issues.apache.org/jira/browse/HIVE-14731
> Project: Hive
> Issue Type: Bug
> Reporter: Zhiyuan Yang
> Assignee: Zhiyuan Yang
> Attachments: HIVE-14731.1.patch, HIVE-14731.10.patch,
> HIVE-14731.11.patch, HIVE-14731.12.patch, HIVE-14731.13.patch,
> HIVE-14731.14.patch, HIVE-14731.15.patch, HIVE-14731.16.patch,
> HIVE-14731.17.patch, HIVE-14731.18.patch, HIVE-14731.19.patch,
> HIVE-14731.2.patch, HIVE-14731.20.patch, HIVE-14731.21.patch,
> HIVE-14731.22.patch, HIVE-14731.23.patch, HIVE-14731.3.patch,
> HIVE-14731.4.patch, HIVE-14731.5.patch, HIVE-14731.6.patch,
> HIVE-14731.7.patch, HIVE-14731.8.patch, HIVE-14731.9.patch,
> HIVE-14731.addendum.patch
>
>
> Given that the cartesian product edge is now available in Tez (see
> TEZ-3230), let's integrate it into Hive on Tez. This allows us to have
> more than one reducer in cross product queries.
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)