[ 
https://issues.apache.org/jira/browse/HIVE-3784?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13564148#comment-13564148
 ] 

Ashutosh Chauhan commented on HIVE-3784:
----------------------------------------

Had an offline review with Namit. Two major points came out:
* Consider a join followed by a group-by. If the join gets converted into a 
map-join, then the subsequent group-by can be pushed into the first MR job 
instead of being executed in a second MR job. This will be done in a 
follow-up jira: HIVE-3952
* Instead of specifying a size threshold for a single table, it would be 
better to specify a threshold on the combined size of all tables.
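
To illustrate the second point, here is a minimal sketch of the two threshold 
policies. It is not Hive's code: the function names, the sizes and the limits 
are all hypothetical, and it assumes that every small table in a map-join has 
to be held in memory at the same time.

```python
from typing import Sequence

MB = 1024 * 1024


def fits_per_table(small_table_sizes: Sequence[int], per_table_limit: int) -> bool:
    """Hypothetical per-table policy: each small table is checked on its own."""
    return all(size <= per_table_limit for size in small_table_sizes)


def fits_combined(small_table_sizes: Sequence[int], combined_limit: int) -> bool:
    """Hypothetical combined policy: the sum of all small tables is checked."""
    return sum(small_table_sizes) <= combined_limit


# Three small tables of 20 MB each (made-up numbers).
sizes = [20 * MB, 20 * MB, 20 * MB]

# Each table is under a 25 MB per-table limit, so this policy accepts the join
# even though 60 MB would have to be loaded in total.
per_table_ok = fits_per_table(sizes, 25 * MB)

# A 50 MB limit on the combined size rejects the same join.
combined_ok = fits_combined(sizes, 50 * MB)
```

With a per-table limit the total memory needed grows with the number of 
tables being joined, while a combined limit bounds it directly.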

Namit, I also had other code-level comments, which I added on Phabricator.
                
> de-emphasize mapjoin hint
> -------------------------
>
>                 Key: HIVE-3784
>                 URL: https://issues.apache.org/jira/browse/HIVE-3784
>             Project: Hive
>          Issue Type: Improvement
>          Components: Query Processor
>            Reporter: Namit Jain
>            Assignee: Namit Jain
>         Attachments: hive.3784.10.patch, hive.3784.11.patch, 
> hive.3784.12.patch, hive.3784.13.patch, hive.3784.14.patch, 
> hive.3784.15.patch, hive.3784.16.patch, hive.3784.17.patch, 
> hive.3784.18.patch, hive.3784.19.patch, hive.3784.1.patch, hive.3784.2.patch, 
> hive.3784.3.patch, hive.3784.4.patch, hive.3784.5.patch, hive.3784.6.patch, 
> hive.3784.7.patch, hive.3784.8.patch, hive.3784.9.patch
>
>
> hive.auto.convert.join has been around for a long time, and is pretty stable.
> When mapjoin hint was created, the above parameter did not exist.
> The only reason for the user to specify a mapjoin currently is if they want
> it to be converted to a bucketed-mapjoin or a sort-merge bucketed mapjoin.
> Eventually, that should also go away, but that may take some time to 
> stabilize.
> There are many rules in SemanticAnalyzer to handle the following trees:
> ReduceSink -> MapJoin
> Union      -> MapJoin
> MapJoin    -> MapJoin
> This should not be supported anymore. In any of the above scenarios, the
> user can get the mapjoin behavior by setting hive.auto.convert.join to true
> and not specifying the hint. This will simplify the code a lot.
> What does everyone think?

