[jira] [Updated] (HIVE-4968) When deduplicate multiple SelectOperators, we should update RowResolver accordinly
[ https://issues.apache.org/jira/browse/HIVE-4968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yin Huai updated HIVE-4968:
---------------------------

    Status: Patch Available  (was: Open)

> When deduplicate multiple SelectOperators, we should update RowResolver accordinly
> -----------------------------------------------------------------------------------
>
>                 Key: HIVE-4968
>                 URL: https://issues.apache.org/jira/browse/HIVE-4968
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Yin Huai
>            Assignee: Yin Huai
>         Attachments: HIVE-4968.D11901.1.patch
>
>
> {code:sql}
> SELECT tmp3.key, tmp3.value, tmp3.count
> FROM (SELECT tmp1.key as key, tmp1.value as value, tmp2.count as count
>       FROM (SELECT key, value
>             FROM src) tmp1
>       JOIN (SELECT count(*) as count
>             FROM src) tmp2
> ) tmp3;
> {code}
> The plan is executable.
> {code:sql}
> SELECT tmp3.key, tmp3.value, tmp3.count
> FROM (SELECT tmp1.key as key, tmp1.value as value, tmp2.count as count
>       FROM (SELECT *
>             FROM src) tmp1
>       JOIN (SELECT count(*) as count
>             FROM src) tmp2
> ) tmp3;
> {code}
> The plan is executable.
> {code}
> SELECT tmp4.key, tmp4.value, tmp4.count
> FROM (SELECT tmp2.key as key, tmp2.value as value, tmp3.count as count
>       FROM (SELECT *
>             FROM (SELECT key, value
>                   FROM src) tmp1 ) tmp2
>       JOIN (SELECT count(*) as count
>             FROM src) tmp3
> ) tmp4;
> {code}
> The plan is not executable.
> The plan related to the MapJoin is
> {code}
>   Stage: Stage-5
>     Map Reduce Local Work
>       Alias -> Map Local Tables:
>         tmp4:tmp2:tmp1:src
>           Fetch Operator
>             limit: -1
>       Alias -> Map Local Operator Tree:
>         tmp4:tmp2:tmp1:src
>           TableScan
>             alias: src
>             Select Operator
>               expressions:
>                     expr: key
>                     type: string
>                     expr: value
>                     type: string
>               outputColumnNames: _col0, _col1
>               HashTable Sink Operator
>                 condition expressions:
>                   0
>                   1 {_col0}
>                 handleSkewJoin: false
>                 keys:
>                   0 []
>                   1 []
>                 Position of Big Table: 1
>   Stage: Stage-4
>     Map Reduce
>       Alias -> Map Operator Tree:
>         $INTNAME
>           Map Join Operator
>             condition map:
>                  Inner Join 0 to 1
>             condition expressions:
>               0
>               1 {_col0}
>             handleSkewJoin: false
>             keys:
>               0 []
>               1 []
>             outputColumnNames: _col2
>             Position of Big Table: 1
>             Select Operator
>               expressions:
>                     expr: _col0
>                     type: string
>                     expr: _col1
>                     type: string
>                     expr: _col2
>                     type: bigint
>               outputColumnNames: _col0, _col1, _col2
>               File Output Operator
>                 compressed: false
>                 GlobalTableId: 0
>                 table:
>                     input format: org.apache.hadoop.mapred.TextInputFormat
>                     output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
>       Local Work:
>         Map Reduce Local Work
> {code}
> The outputColumnNames of MapJoin is '_col2'. But it should be '_col0, _col1, _col2'.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
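
The symptom reported above (a Select Operator reading _col0, _col1 and _col2 from a Map Join Operator that only emits _col2) can be reproduced with a small consistency check. The following is only a sketch in Python, not Hive code: the `Op` class and the `missing_columns` helper are hypothetical names made up for illustration, and the operator chain is reduced to the two Stage-4 operators from the plan.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Op:
    """Toy operator (hypothetical, not a Hive class): the columns it reads
    from its parent and the columns it emits to its child."""
    name: str
    reads: List[str]
    outputs: List[str]
    child: Optional["Op"] = None


def missing_columns(root: Op) -> List[Tuple[str, str, List[str]]]:
    """Walk a parent -> child chain and report every column a child reads
    that its parent does not emit."""
    problems = []
    parent = root
    while parent.child is not None:
        child = parent.child
        absent = [c for c in child.reads if c not in parent.outputs]
        if absent:
            problems.append((parent.name, child.name, absent))
        parent = child
    return problems


def stage4(mapjoin_outputs: List[str]) -> Op:
    """Build the Stage-4 chain (Map Join -> Select) with the given MapJoin outputs."""
    select = Op("Select Operator",
                reads=["_col0", "_col1", "_col2"],
                outputs=["_col0", "_col1", "_col2"])
    return Op("Map Join Operator", reads=[], outputs=mapjoin_outputs, child=select)


broken = stage4(["_col2"])                    # outputColumnNames as reported
expected = stage4(["_col0", "_col1", "_col2"])  # outputColumnNames as they should be

print(missing_columns(broken))
print(missing_columns(expected))
```

With the reported outputColumnNames the check flags _col0 and _col1 as unavailable to the Select Operator; with the expected outputColumnNames it reports nothing.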
[jira] [Updated] (HIVE-4968) When deduplicate multiple SelectOperators, we should update RowResolver accordinly
[ https://issues.apache.org/jira/browse/HIVE-4968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Phabricator updated HIVE-4968:
------------------------------

    Attachment: HIVE-4968.D11901.1.patch

yhuai requested code review of "HIVE-4968 [jira] When deduplicate multiple SelectOperators, we should update RowResolver accordinly".

Reviewers: JIRA

Merge remote-tracking branch 'origin/trunk' into HIVE-4968

TEST PLAN
  EMPTY

REVISION DETAIL
  https://reviews.facebook.net/D11901

AFFECTED FILES
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/NonBlockingOpDeDupProc.java
  ql/src/java/org/apache/hadoop/hive/ql/parse/ParseContext.java
  ql/src/test/queries/clientpositive/nonblock_op_deduplicate.q
  ql/src/test/results/clientpositive/nonblock_op_deduplicate.q.out

To: JIRA, yhuai
[jira] [Updated] (HIVE-4968) When deduplicate multiple SelectOperators, we should update RowResolver accordinly
[ https://issues.apache.org/jira/browse/HIVE-4968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yin Huai updated HIVE-4968:
---------------------------

    Summary: When deduplicate multiple SelectOperators, we should update RowResolver accordinly  (was: Broken plan in MapJoin)
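
The issue summary says that when adjacent SelectOperators are deduplicated, the RowResolver should be updated as well. The following is a heavily simplified Python sketch of that idea, not the actual patch: the dictionaries, the `merge_selects` and `dedup` helpers, and the operator ids SEL_1 and SEL_2 are all hypothetical, and Hive's real RowResolver and NonBlockingOpDeDupProc are more involved. It assumes each Select is a map from output column to input column, and that a separate table maps each operator to the qualified names it exposes.

```python
from typing import Dict

# Hypothetical toy model: a Select maps output column -> input column.
Select = Dict[str, str]


def merge_selects(parent: Select, child: Select) -> Select:
    """Compose two adjacent projections into one: for each child output,
    look through the child to the parent's input column."""
    return {out: parent[src] for out, src in child.items()}


def dedup(parent_id: str, child_id: str,
          selects: Dict[str, Select],
          resolvers: Dict[str, Dict[str, str]]) -> None:
    """Fold the child Select into its parent. The child's resolver is carried
    over to the surviving operator so that later lookups by the child's
    aliases still succeed (assumed to be the step the summary refers to)."""
    selects[parent_id] = merge_selects(selects[parent_id], selects.pop(child_id))
    resolvers[parent_id] = resolvers.pop(child_id)


selects = {
    "SEL_1": {"_col0": "key", "_col1": "value"},    # SELECT key, value FROM src
    "SEL_2": {"_col0": "_col0", "_col1": "_col1"},  # SELECT * FROM (...) tmp1
}
resolvers = {
    "SEL_1": {"tmp1.key": "_col0", "tmp1.value": "_col1"},
    "SEL_2": {"tmp2.key": "_col0", "tmp2.value": "_col1"},
}

dedup("SEL_1", "SEL_2", selects, resolvers)
print(selects)
print(resolvers)
```

In this model, skipping the resolver update would leave SEL_1 exposing only tmp1.key and tmp1.value, so a later lookup of tmp2.key against the surviving operator would find nothing.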