[ 
https://issues.apache.org/jira/browse/HIVE-1194?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12838113#action_12838113
 ] 

Namit Jain commented on HIVE-1194:
----------------------------------

Based on an offline discussion with Yongqiang, we were thinking of the following:


There will be a new mapping in MapredWork:
Operator -> MapredLocalWork

This will be populated for SortMergeJoinOperator only.
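
To make the shape of that mapping concrete, here is a minimal sketch. The classes below are simplified, hypothetical stand-ins for the Hive classes of the same names, and the field and method names (opToLocalWork, putLocalWork, getLocalWork) are assumptions made for illustration, not the actual patch.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in; the real Hive Operator carries far more state.
class Operator {
    private final String name;

    Operator(String name) {
        this.name = name;
    }

    String getName() {
        return name;
    }
}

// Hypothetical stand-in for the local (non-MapReduce) part of a plan.
class MapredLocalWork {
    final String description;

    MapredLocalWork(String description) {
        this.description = description;
    }
}

class MapredWork {
    // Proposed addition: local work keyed by the operator that needs it.
    // Operator does not override equals/hashCode, so keys compare by identity.
    private final Map<Operator, MapredLocalWork> opToLocalWork = new LinkedHashMap<>();

    void putLocalWork(Operator op, MapredLocalWork work) {
        opToLocalWork.put(op, work);
    }

    // Returns null for operators that have no local work registered.
    MapredLocalWork getLocalWork(Operator op) {
        return opToLocalWork.get(op);
    }
}
```

Keying by operator instance means two operators that report the same name can still carry different local work, which matters because the new operator shares its name with MapJoinOperator.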

SortMergeJoinOperator is a new operator which extends MapJoinOperator, and has the
same name as a MapJoinOperator.

MapJoinProcessor needs to create a SortMergeJoinOperator instead of a MapJoinOperator
when it sees the new configuration parameter.
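
A rough sketch of the operator relationship and the selection step, under stated assumptions: the classes are hypothetical stand-ins for the Hive classes, the method createJoinOperator is invented for illustration, the "MAPJOIN" string is only an illustrative name, and the configuration parameter is modelled as a plain boolean because its name has not been decided here.

```java
// Hypothetical stand-in for the existing map-join operator.
class MapJoinOperator {
    String getName() {
        return "MAPJOIN"; // illustrative value
    }
}

// The new operator extends MapJoinOperator and does not override getName(),
// so both operators report the same name.
class SortMergeJoinOperator extends MapJoinOperator {
}

class MapJoinProcessor {
    // useSortMergeJoin stands in for the new configuration parameter.
    static MapJoinOperator createJoinOperator(boolean useSortMergeJoin) {
        return useSortMergeJoin ? new SortMergeJoinOperator() : new MapJoinOperator();
    }
}
```

Because the subclass keeps the inherited name, any code that looks operators up by name should keep treating the new operator as a map join.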

MapJoinFactory methods need to change to populate the Operator -> MapredLocalWork
mapping instead of the MapredLocalWork in MapredWork.

> sorted merge join
> -----------------
>
>                 Key: HIVE-1194
>                 URL: https://issues.apache.org/jira/browse/HIVE-1194
>             Project: Hadoop Hive
>          Issue Type: New Feature
>          Components: Query Processor
>            Reporter: Namit Jain
>            Assignee: He Yongqiang
>             Fix For: 0.6.0
>
>
> If the input tables are sorted on the join key, and a mapjoin is being 
> performed, it is useful to exploit the sorted properties of the table.
> This can lead to substantial CPU savings. This needs to work across bucketed 
> map joins also.
> Since the sorted properties of a table are not currently enforced, a new 
> parameter can be added to request the sort-merge join.
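
For reference, the general technique the issue describes can be sketched as below. This is a generic in-memory merge join over two inputs already sorted on the join key, not Hive's implementation; the class name SortedMergeJoin and the {key, value} int-array row layout are assumptions chosen to keep the example short.

```java
import java.util.ArrayList;
import java.util.List;

final class SortedMergeJoin {
    // Inner-joins two lists of {key, value} rows, each sorted ascending by key.
    // Emits {key, leftValue, rightValue} for every matching pair.
    static List<int[]> join(List<int[]> left, List<int[]> right) {
        List<int[]> out = new ArrayList<>();
        int i = 0;
        int j = 0;
        while (i < left.size() && j < right.size()) {
            int lk = left.get(i)[0];
            int rk = right.get(j)[0];
            if (lk < rk) {
                i++; // left key has no match yet, advance left
            } else if (lk > rk) {
                j++; // right key has no match yet, advance right
            } else {
                // Find the end of the run of equal keys on the right side.
                int jEnd = j;
                while (jEnd < right.size() && right.get(jEnd)[0] == lk) {
                    jEnd++;
                }
                // Pair every left row in the run with every right row in the run.
                while (i < left.size() && left.get(i)[0] == lk) {
                    for (int k = j; k < jEnd; k++) {
                        out.add(new int[] {lk, left.get(i)[1], right.get(k)[1]});
                    }
                    i++;
                }
                j = jEnd;
            }
        }
        return out;
    }
}
```

Each input is scanned once and no hash table is built over either side, which is where the CPU savings over a hash-based map join would come from when the sort order can be trusted.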

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
