[ https://issues.apache.org/jira/browse/HIVE-4968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Phabricator updated HIVE-4968:
------------------------------
Attachment: HIVE-4968.D11901.1.patch
yhuai requested code review of "HIVE-4968 [jira] When deduplicating multiple
SelectOperators, we should update RowResolver accordingly".
Reviewers: JIRA
Merge remote-tracking branch 'origin/trunk' into HIVE-4968
SELECT tmp3.key, tmp3.value, tmp3.count
FROM (SELECT tmp1.key as key, tmp1.value as value, tmp2.count as count
      FROM (SELECT key, value
            FROM src) tmp1
      JOIN (SELECT count(*) as count
            FROM src) tmp2
      ) tmp3;
The plan is executable.
SELECT tmp3.key, tmp3.value, tmp3.count
FROM (SELECT tmp1.key as key, tmp1.value as value, tmp2.count as count
      FROM (SELECT *
            FROM src) tmp1
      JOIN (SELECT count(*) as count
            FROM src) tmp2
      ) tmp3;
The plan is executable.
SELECT tmp4.key, tmp4.value, tmp4.count
FROM (SELECT tmp2.key as key, tmp2.value as value, tmp3.count as count
      FROM (SELECT *
            FROM (SELECT key, value
                  FROM src) tmp1 ) tmp2
      JOIN (SELECT count(*) as count
            FROM src) tmp3
      ) tmp4;
The plan for this query is not executable.
The part of the plan related to the MapJoin is:
  Stage: Stage-5
    Map Reduce Local Work
      Alias -> Map Local Tables:
        tmp4:tmp2:tmp1:src
          Fetch Operator
            limit: -1
      Alias -> Map Local Operator Tree:
        tmp4:tmp2:tmp1:src
          TableScan
            alias: src
            Select Operator
              expressions:
                expr: key
                type: string
                expr: value
                type: string
              outputColumnNames: _col0, _col1
              HashTable Sink Operator
                condition expressions:
                  0
                  1 {_col0}
                handleSkewJoin: false
                keys:
                  0 []
                  1 []
                Position of Big Table: 1
  Stage: Stage-4
    Map Reduce
      Alias -> Map Operator Tree:
        $INTNAME
          Map Join Operator
            condition map:
              Inner Join 0 to 1
            condition expressions:
              0
              1 {_col0}
            handleSkewJoin: false
            keys:
              0 []
              1 []
            outputColumnNames: _col2
            Position of Big Table: 1
            Select Operator
              expressions:
                expr: _col0
                type: string
                expr: _col1
                type: string
                expr: _col2
                type: bigint
              outputColumnNames: _col0, _col1, _col2
              File Output Operator
                compressed: false
                GlobalTableId: 0
                table:
                  input format: org.apache.hadoop.mapred.TextInputFormat
                  output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
      Local Work:
        Map Reduce Local Work
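The outputColumnNames of the MapJoin is '_col2', but it should be '_col0, _col1,
_col2': the Select Operator that follows the MapJoin reads _col0 and _col1,
which the MapJoin does not output.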
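
For illustration only, here is a minimal Python sketch of the idea in the issue
title: when two adjacent select operators are merged, the merged operator
should carry the row resolver of the operator it replaces, so that downstream
column lookups still succeed. The names Select, merge_selects and resolve are
hypothetical and do not correspond to Hive's SelectOperator, RowResolver or
NonBlockingOpDeDupProc; the actual mechanism in Hive may differ.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Select:
    """Toy stand-in for a select operator (hypothetical, not Hive's class)."""
    exprs: List[str]          # input column read for each output position
    out_names: List[str]      # internal output column names
    resolver: Dict[str, str]  # toy "row resolver": alias.column -> internal name


def merge_selects(parent: Select, child: Select) -> Select:
    """Fold `child` into `parent`, keeping the child's view of the output."""
    # Rewrite each child expression in terms of the parent's inputs.
    parent_inputs = dict(zip(parent.out_names, parent.exprs))
    exprs = [parent_inputs[e] for e in child.exprs]
    # The merged operator exposes the child's names and resolver, because
    # operators downstream refer to the child's columns.
    return Select(exprs, list(child.out_names), dict(child.resolver))


def resolve(resolver: Dict[str, str], wanted: List[str]) -> List[str]:
    """Return the internal names of the wanted columns the resolver knows."""
    return [resolver[w] for w in wanted if w in resolver]


# tmp1: SELECT key, value FROM src
tmp1 = Select(["key", "value"], ["_col0", "_col1"],
              {"tmp1.key": "_col0", "tmp1.value": "_col1"})
# tmp2: SELECT * FROM tmp1
tmp2 = Select(["_col0", "_col1"], ["_col0", "_col1"],
              {"tmp2.key": "_col0", "tmp2.value": "_col1"})

merged = merge_selects(tmp1, tmp2)
wanted = ["tmp2.key", "tmp2.value"]
print(resolve(merged.resolver, wanted))  # ['_col0', '_col1'] (resolver updated)
print(resolve(tmp1.resolver, wanted))    # [] (stale resolver left in place)
```

In this toy model, leaving the stale resolver in place makes tmp2.key and
tmp2.value unresolvable, which is one plausible way columns such as _col0 and
_col1 could be dropped from an upstream operator's output.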
TEST PLAN
EMPTY
REVISION DETAIL
https://reviews.facebook.net/D11901
AFFECTED FILES
ql/src/java/org/apache/hadoop/hive/ql/optimizer/NonBlockingOpDeDupProc.java
ql/src/java/org/apache/hadoop/hive/ql/parse/ParseContext.java
ql/src/test/queries/clientpositive/nonblock_op_deduplicate.q
ql/src/test/results/clientpositive/nonblock_op_deduplicate.q.out
To: JIRA, yhuai
> When deduplicating multiple SelectOperators, we should update RowResolver
> accordingly
> ----------------------------------------------------------------------------------
>
> Key: HIVE-4968
> URL: https://issues.apache.org/jira/browse/HIVE-4968
> Project: Hive
> Issue Type: Bug
> Reporter: Yin Huai
> Assignee: Yin Huai
> Attachments: HIVE-4968.D11901.1.patch
>
>
> {code:sql}
> SELECT tmp3.key, tmp3.value, tmp3.count
> FROM (SELECT tmp1.key as key, tmp1.value as value, tmp2.count as count
>       FROM (SELECT key, value
>             FROM src) tmp1
>       JOIN (SELECT count(*) as count
>             FROM src) tmp2
>       ) tmp3;
> {code}
> The plan is executable.
> {code:sql}
> SELECT tmp3.key, tmp3.value, tmp3.count
> FROM (SELECT tmp1.key as key, tmp1.value as value, tmp2.count as count
>       FROM (SELECT *
>             FROM src) tmp1
>       JOIN (SELECT count(*) as count
>             FROM src) tmp2
>       ) tmp3;
> {code}
> The plan is executable.
> {code:sql}
> SELECT tmp4.key, tmp4.value, tmp4.count
> FROM (SELECT tmp2.key as key, tmp2.value as value, tmp3.count as count
>       FROM (SELECT *
>             FROM (SELECT key, value
>                   FROM src) tmp1 ) tmp2
>       JOIN (SELECT count(*) as count
>             FROM src) tmp3
>       ) tmp4;
> {code}
> The plan for this query is not executable.
> The part of the plan related to the MapJoin is:
> {code}
>   Stage: Stage-5
>     Map Reduce Local Work
>       Alias -> Map Local Tables:
>         tmp4:tmp2:tmp1:src
>           Fetch Operator
>             limit: -1
>       Alias -> Map Local Operator Tree:
>         tmp4:tmp2:tmp1:src
>           TableScan
>             alias: src
>             Select Operator
>               expressions:
>                 expr: key
>                 type: string
>                 expr: value
>                 type: string
>               outputColumnNames: _col0, _col1
>               HashTable Sink Operator
>                 condition expressions:
>                   0
>                   1 {_col0}
>                 handleSkewJoin: false
>                 keys:
>                   0 []
>                   1 []
>                 Position of Big Table: 1
>   Stage: Stage-4
>     Map Reduce
>       Alias -> Map Operator Tree:
>         $INTNAME
>           Map Join Operator
>             condition map:
>               Inner Join 0 to 1
>             condition expressions:
>               0
>               1 {_col0}
>             handleSkewJoin: false
>             keys:
>               0 []
>               1 []
>             outputColumnNames: _col2
>             Position of Big Table: 1
>             Select Operator
>               expressions:
>                 expr: _col0
>                 type: string
>                 expr: _col1
>                 type: string
>                 expr: _col2
>                 type: bigint
>               outputColumnNames: _col0, _col1, _col2
>               File Output Operator
>                 compressed: false
>                 GlobalTableId: 0
>                 table:
>                   input format: org.apache.hadoop.mapred.TextInputFormat
>                   output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
>       Local Work:
>         Map Reduce Local Work
> {code}
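> The outputColumnNames of the MapJoin is '_col2', but it should be '_col0, _col1,
> _col2': the Select Operator that follows the MapJoin reads _col0 and _col1,
> which the MapJoin does not output.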