[ https://issues.apache.org/jira/browse/HIVE-1695?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12965892#action_12965892 ]
He Yongqiang commented on HIVE-1695:
------------------------------------
To be accurate, MapJoin followed by ReduceSink followed by GroupBy should run
in a single MapReduce job.
So at some point (for example, while processing MapJoin%SEL), we know that this
MapJoin is followed by a ReduceSink.
If at that point we also know that the ReduceSink is for a GroupBy, then we can
simply skip splitting the task.
Would that be easier?
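
The check suggested above could be sketched roughly as follows. This is a toy model only: the Op and OpType types and the canStayInOneJob method are hypothetical names made up for illustration, not Hive's operator classes or its actual task-splitting code. It assumes a linear operator chain and treats any Select operators between the MapJoin and the ReduceSink as pass-through.

```java
import java.util.ArrayList;
import java.util.List;

// Toy operator tree. These types are hypothetical and are NOT Hive's real operator classes.
final class MapJoinPlanSketch {

    enum OpType { TABLE_SCAN, MAP_JOIN, SELECT, REDUCE_SINK, GROUP_BY, FILE_SINK }

    static final class Op {
        final OpType type;
        final List<Op> children = new ArrayList<>();

        Op(OpType type) {
            this.type = type;
        }

        // Appends a child and returns it, so a linear pipeline can be chained.
        Op then(Op child) {
            children.add(child);
            return child;
        }
    }

    // Returns the single child of an operator, or null if it has zero or several children.
    static Op onlyChild(Op op) {
        return op.children.size() == 1 ? op.children.get(0) : null;
    }

    // True when the chain is MapJoin -> (Select)* -> ReduceSink -> GroupBy,
    // i.e. the case where the comment proposes to skip splitting the task.
    static boolean canStayInOneJob(Op mapJoin) {
        if (mapJoin.type != OpType.MAP_JOIN) {
            return false;
        }
        Op next = onlyChild(mapJoin);
        // Skip over Select operators that sit directly after the MapJoin.
        while (next != null && next.type == OpType.SELECT) {
            next = onlyChild(next);
        }
        if (next == null || next.type != OpType.REDUCE_SINK) {
            return false;
        }
        Op afterSink = onlyChild(next);
        return afterSink != null && afterSink.type == OpType.GROUP_BY;
    }

    public static void main(String[] args) {
        Op withGroupBy = new Op(OpType.MAP_JOIN);
        withGroupBy.then(new Op(OpType.SELECT))
                   .then(new Op(OpType.REDUCE_SINK))
                   .then(new Op(OpType.GROUP_BY));

        Op withoutReduceSink = new Op(OpType.MAP_JOIN);
        withoutReduceSink.then(new Op(OpType.SELECT))
                         .then(new Op(OpType.FILE_SINK));

        System.out.println(canStayInOneJob(withGroupBy));
        System.out.println(canStayInOneJob(withoutReduceSink));
    }
}
```

In this sketch the decision is made while standing on the MapJoin node and looking ahead, which mirrors the idea of deciding at MapJoin%SEL instead of merging two already-split tasks afterwards.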
> MapJoin followed by ReduceSink should be done as single MapReduce Job
> ---------------------------------------------------------------------
>
> Key: HIVE-1695
> URL: https://issues.apache.org/jira/browse/HIVE-1695
> Project: Hive
> Issue Type: Improvement
> Components: Query Processor
> Reporter: Amareshwari Sriramadasu
>
> Currently MapJoin followed by ReduceSink runs as two MapReduce jobs: a
> map-only job followed by a MapReduce job. These can be combined into a single
> MapReduce job.