Navis created HIVE-6144: --------------------------- Summary: Implement non-staged MapJoin Key: HIVE-6144 URL: https://issues.apache.org/jira/browse/HIVE-6144 Project: Hive Issue Type: Improvement Components: Query Processor Reporter: Navis Assignee: Navis Priority: Minor
For map join, all data in small aliases are hashed and stored into temporary file in MapRedLocalTask. But for some aliases without filter or projection, it seemed not necessary to do that. For example. {noformat} select a.* from src a join src b on a.key=b.key; {noformat} makes plan like this. {noformat} STAGE PLANS: Stage: Stage-4 Map Reduce Local Work Alias -> Map Local Tables: a Fetch Operator limit: -1 Alias -> Map Local Operator Tree: a TableScan alias: a HashTable Sink Operator condition expressions: 0 {key} {value} 1 handleSkewJoin: false keys: 0 [Column[key]] 1 [Column[key]] Position of Big Table: 1 Stage: Stage-3 Map Reduce Alias -> Map Operator Tree: b TableScan alias: b Map Join Operator condition map: Inner Join 0 to 1 condition expressions: 0 {key} {value} 1 handleSkewJoin: false keys: 0 [Column[key]] 1 [Column[key]] outputColumnNames: _col0, _col1 Position of Big Table: 1 Select Operator File Output Operator Local Work: Map Reduce Local Work Stage: Stage-0 Fetch Operator {noformat} table src(a) is fetched and stored as-is in MRLocalTask. With this patch, plan can be like below. {noformat} Stage: Stage-3 Map Reduce Alias -> Map Operator Tree: b TableScan alias: b Map Join Operator condition map: Inner Join 0 to 1 condition expressions: 0 {key} {value} 1 handleSkewJoin: false keys: 0 [Column[key]] 1 [Column[key]] outputColumnNames: _col0, _col1 Position of Big Table: 1 Select Operator File Output Operator Local Work: Map Reduce Local Work Alias -> Map Local Tables: a Fetch Operator limit: -1 Alias -> Map Local Operator Tree: a TableScan alias: a Has Any Stage Alias: false Stage: Stage-0 Fetch Operator {noformat} -- This message was sent by Atlassian JIRA (v6.1.5#6160)