[ 
https://issues.apache.org/jira/browse/HIVE-6144?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13871562#comment-13871562
 ] 

Vikram Dixit K commented on HIVE-6144:
--------------------------------------

Hi Navis,

It still looks like the flag is off by default. I think it is useful to have it 
on by default. Any concerns?

Thanks
Vikram.

> Implement non-staged MapJoin
> ----------------------------
>
>                 Key: HIVE-6144
>                 URL: https://issues.apache.org/jira/browse/HIVE-6144
>             Project: Hive
>          Issue Type: Improvement
>          Components: Query Processor
>            Reporter: Navis
>            Assignee: Navis
>            Priority: Minor
>         Attachments: HIVE-6144.1.patch.txt, HIVE-6144.2.patch.txt, 
> HIVE-6144.3.patch.txt
>
>
> For a map join, all data in the small aliases is hashed and stored in a 
> temporary file by MapRedLocalTask. But for aliases without a filter or 
> projection, that does not seem necessary. For example,
> {noformat}
> select a.* from src a join src b on a.key=b.key;
> {noformat}
> produces a plan like this:
> {noformat}
> STAGE PLANS:
>   Stage: Stage-4
>     Map Reduce Local Work
>       Alias -> Map Local Tables:
>         a 
>           Fetch Operator
>             limit: -1
>       Alias -> Map Local Operator Tree:
>         a 
>           TableScan
>             alias: a
>             HashTable Sink Operator
>               condition expressions:
>                 0 {key} {value}
>                 1 
>               handleSkewJoin: false
>               keys:
>                 0 [Column[key]]
>                 1 [Column[key]]
>               Position of Big Table: 1
>   Stage: Stage-3
>     Map Reduce
>       Alias -> Map Operator Tree:
>         b 
>           TableScan
>             alias: b
>             Map Join Operator
>               condition map:
>                    Inner Join 0 to 1
>               condition expressions:
>                 0 {key} {value}
>                 1 
>               handleSkewJoin: false
>               keys:
>                 0 [Column[key]]
>                 1 [Column[key]]
>               outputColumnNames: _col0, _col1
>               Position of Big Table: 1
>               Select Operator
>                 File Output Operator
>       Local Work:
>         Map Reduce Local Work
>   Stage: Stage-0
>     Fetch Operator
> {noformat}
> Table src (alias a) is fetched and stored as-is by MapRedLocalTask. With this 
> patch, the plan can instead look like this:
> {noformat}
>   Stage: Stage-3
>     Map Reduce
>       Alias -> Map Operator Tree:
>         b 
>           TableScan
>             alias: b
>             Map Join Operator
>               condition map:
>                    Inner Join 0 to 1
>               condition expressions:
>                 0 {key} {value}
>                 1 
>               handleSkewJoin: false
>               keys:
>                 0 [Column[key]]
>                 1 [Column[key]]
>               outputColumnNames: _col0, _col1
>               Position of Big Table: 1
>               Select Operator
>                   File Output Operator
>       Local Work:
>         Map Reduce Local Work
>           Alias -> Map Local Tables:
>             a 
>               Fetch Operator
>                 limit: -1
>           Alias -> Map Local Operator Tree:
>             a 
>               TableScan
>                 alias: a
>           Has Any Stage Alias: false
>   Stage: Stage-0
>     Fetch Operator
> {noformat}
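>
> A minimal sketch of how the two plans above might be compared. The property 
> name hive.auto.convert.join.use.nonstaged is not named in this thread; it is 
> taken from Hive's configuration documentation as the flag associated with 
> this issue, so treat it as an assumption for your build. The resulting plan 
> may also differ with Hive version and other join settings.
> {noformat}
> -- allow common joins to be converted to map joins
> set hive.auto.convert.join=true;
> -- assumed flag name; false should keep the staged MapRedLocalTask plan
> set hive.auto.convert.join.use.nonstaged=false;
> explain select a.* from src a join src b on a.key=b.key;
> -- true should let a small alias with no filter or projection skip staging
> set hive.auto.convert.join.use.nonstaged=true;
> explain select a.* from src a join src b on a.key=b.key;
> {noformat}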



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
