Navis created HIVE-6144:
---------------------------
Summary: Implement non-staged MapJoin
Key: HIVE-6144
URL: https://issues.apache.org/jira/browse/HIVE-6144
Project: Hive
Issue Type: Improvement
Components: Query Processor
Reporter: Navis
Assignee: Navis
Priority: Minor
For a map join, all data in the small aliases is hashed and stored in a
temporary file by MapredLocalTask. For aliases that have no filter or
projection, however, this staging step seems unnecessary. For example:
{noformat}
select a.* from src a join src b on a.key=b.key;
{noformat}
produces the following plan:
{noformat}
STAGE PLANS:
  Stage: Stage-4
    Map Reduce Local Work
      Alias -> Map Local Tables:
        a
          Fetch Operator
            limit: -1
      Alias -> Map Local Operator Tree:
        a
          TableScan
            alias: a
            HashTable Sink Operator
              condition expressions:
                0 {key} {value}
                1
              handleSkewJoin: false
              keys:
                0 [Column[key]]
                1 [Column[key]]
              Position of Big Table: 1
  Stage: Stage-3
    Map Reduce
      Alias -> Map Operator Tree:
        b
          TableScan
            alias: b
            Map Join Operator
              condition map:
                Inner Join 0 to 1
              condition expressions:
                0 {key} {value}
                1
              handleSkewJoin: false
              keys:
                0 [Column[key]]
                1 [Column[key]]
              outputColumnNames: _col0, _col1
              Position of Big Table: 1
              Select Operator
                File Output Operator
      Local Work:
        Map Reduce Local Work
  Stage: Stage-0
    Fetch Operator
{noformat}
Here table src (alias a) is fetched and stored as-is by MapredLocalTask. With
this patch, the plan can instead look like this:
{noformat}
  Stage: Stage-3
    Map Reduce
      Alias -> Map Operator Tree:
        b
          TableScan
            alias: b
            Map Join Operator
              condition map:
                Inner Join 0 to 1
              condition expressions:
                0 {key} {value}
                1
              handleSkewJoin: false
              keys:
                0 [Column[key]]
                1 [Column[key]]
              outputColumnNames: _col0, _col1
              Position of Big Table: 1
              Select Operator
                File Output Operator
      Local Work:
        Map Reduce Local Work
          Alias -> Map Local Tables:
            a
              Fetch Operator
                limit: -1
          Alias -> Map Local Operator Tree:
            a
              TableScan
                alias: a
          Has Any Stage Alias: false
  Stage: Stage-0
    Fetch Operator
{noformat}
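The difference between the two plans can be sketched in plain Java. This is only a conceptual illustration, not Hive code: the class MapJoinSketch and its methods buildHashTable, loadStaged, loadDirect and probe are hypothetical names, and rows are simplified to (key, value) string pairs. It assumes the small alias has no filter or projection, so both paths must yield the same join result.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;

// Hypothetical sketch; none of these names exist in Hive.
final class MapJoinSketch {

    /** Hashes the small alias on its join key: key -> values of the rows with that key. */
    static HashMap<String, ArrayList<String>> buildHashTable(List<String[]> smallRows) {
        HashMap<String, ArrayList<String>> table = new HashMap<>();
        for (String[] row : smallRows) {
            table.computeIfAbsent(row[0], k -> new ArrayList<>()).add(row[1]);
        }
        return table;
    }

    /** Staged: a local step dumps the hash table to a temp file, the map side reloads it. */
    @SuppressWarnings("unchecked")
    static HashMap<String, ArrayList<String>> loadStaged(List<String[]> smallRows) {
        try {
            Path dump = Files.createTempFile("hashtable", ".bin");
            try {
                // Local-task side: hash the small alias and write it out.
                try (ObjectOutputStream out = new ObjectOutputStream(Files.newOutputStream(dump))) {
                    out.writeObject(buildHashTable(smallRows));
                }
                // Map-task side: read the dumped hash table back into memory.
                try (ObjectInputStream in = new ObjectInputStream(Files.newInputStream(dump))) {
                    return (HashMap<String, ArrayList<String>>) in.readObject();
                }
            } finally {
                Files.deleteIfExists(dump);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Non-staged: the map side fetches the small alias and hashes it in memory, no temp file. */
    static HashMap<String, ArrayList<String>> loadDirect(List<String[]> smallRows) {
        return buildHashTable(smallRows);
    }

    /** Probes the hash table with each big-table key and emits "key<TAB>value" per match. */
    static List<String> probe(HashMap<String, ArrayList<String>> small, List<String> bigKeys) {
        List<String> out = new ArrayList<>();
        for (String key : bigKeys) {
            for (String value : small.getOrDefault(key, new ArrayList<>())) {
                out.add(key + "\t" + value);
            }
        }
        return out;
    }
}
```

In this sketch loadStaged and loadDirect return equal hash tables; the staged path only adds a write and a read of the temporary file, which is the overhead the patch aims to avoid for such aliases.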