[jira] [Updated] (HIVE-4968) When deduplicating multiple SelectOperators, we should update RowResolver accordingly
[ https://issues.apache.org/jira/browse/HIVE-4968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yin Huai updated HIVE-4968:
---
Summary: When deduplicating multiple SelectOperators, we should update RowResolver accordingly (was: When deduplicate multiple SelectOperators, we should update RowResolver accordingly)

When deduplicating multiple SelectOperators, we should update RowResolver accordingly

Key: HIVE-4968
URL: https://issues.apache.org/jira/browse/HIVE-4968
Project: Hive
Issue Type: Bug
Reporter: Yin Huai
Assignee: Yin Huai
Attachments: HIVE-4968.D11901.1.patch

{code:sql}
SELECT tmp3.key, tmp3.value, tmp3.count
FROM (SELECT tmp1.key as key, tmp1.value as value, tmp2.count as count
      FROM (SELECT key, value FROM src) tmp1
      JOIN (SELECT count(*) as count FROM src) tmp2
     ) tmp3;
{code}
The plan is executable.

{code:sql}
SELECT tmp3.key, tmp3.value, tmp3.count
FROM (SELECT tmp1.key as key, tmp1.value as value, tmp2.count as count
      FROM (SELECT * FROM src) tmp1
      JOIN (SELECT count(*) as count FROM src) tmp2
     ) tmp3;
{code}
The plan is executable.

{code:sql}
SELECT tmp4.key, tmp4.value, tmp4.count
FROM (SELECT tmp2.key as key, tmp2.value as value, tmp3.count as count
      FROM (SELECT *
            FROM (SELECT key, value FROM src) tmp1
           ) tmp2
      JOIN (SELECT count(*) as count FROM src) tmp3
     ) tmp4;
{code}
The plan is not executable.
The plan related to the MapJoin is
{code}
  Stage: Stage-5
    Map Reduce Local Work
      Alias -> Map Local Tables:
        tmp4:tmp2:tmp1:src
          Fetch Operator
            limit: -1
      Alias -> Map Local Operator Tree:
        tmp4:tmp2:tmp1:src
          TableScan
            alias: src
            Select Operator
              expressions:
                    expr: key
                    type: string
                    expr: value
                    type: string
              outputColumnNames: _col0, _col1
              HashTable Sink Operator
                condition expressions:
                  0
                  1 {_col0}
                handleSkewJoin: false
                keys:
                  0 []
                  1 []
                Position of Big Table: 1

  Stage: Stage-4
    Map Reduce
      Alias -> Map Operator Tree:
        $INTNAME
            Map Join Operator
              condition map:
                   Inner Join 0 to 1
              condition expressions:
                0
                1 {_col0}
              handleSkewJoin: false
              keys:
                0 []
                1 []
              outputColumnNames: _col2
              Position of Big Table: 1
              Select Operator
                expressions:
                      expr: _col0
                      type: string
                      expr: _col1
                      type: string
                      expr: _col2
                      type: bigint
                outputColumnNames: _col0, _col1, _col2
                File Output Operator
                  compressed: false
                  GlobalTableId: 0
                  table:
                      input format: org.apache.hadoop.mapred.TextInputFormat
                      output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
      Local Work:
        Map Reduce Local Work
{code}
The outputColumnNames of the MapJoin is '_col2', but it should be '_col0, _col1, _col2'.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
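The issue title says that when two stacked SelectOperators are collapsed into one, the RowResolver has to be updated as well. The following is a toy sketch of that idea in Python, not Hive's Java code: the dictionaries, the `merge_selects` and `dedup` helpers, and the choice to keep the child's resolver are all illustrative assumptions, and the messages here do not state what the attached patch actually does.

```python
# Toy model only: these dictionaries stand in for a select operator's column
# list and its row resolver. None of the names below are Hive APIs.

def merge_selects(parent_exprs, child_exprs):
    """Fold a child select into its parent by substituting column references.

    parent_exprs: output name -> source column produced by the parent select
    child_exprs:  output name -> parent output name that the child reads
    """
    return {out: parent_exprs[src] for out, src in child_exprs.items()}


def dedup(parent, child):
    """Return a single select that replaces the parent/child pair.

    Each select is a dict with 'exprs' (see merge_selects) and 'resolver',
    a map from user-visible (alias, column) pairs to output names.
    """
    return {
        "exprs": merge_selects(parent["exprs"], child["exprs"]),
        # Assumption: downstream operators look columns up under the child's
        # alias, so the surviving operator must expose the child's resolver.
        "resolver": dict(child["resolver"]),
    }


# Mirrors the failing query: (SELECT * FROM (SELECT key, value FROM src) tmp1) tmp2
parent = {
    "exprs": {"_col0": "key", "_col1": "value"},
    "resolver": {("tmp1", "key"): "_col0", ("tmp1", "value"): "_col1"},
}
child = {
    "exprs": {"_col0": "_col0", "_col1": "_col1"},
    "resolver": {("tmp2", "key"): "_col0", ("tmp2", "value"): "_col1"},
}

merged = dedup(parent, child)
print(merged["exprs"])
print(merged["resolver"][("tmp2", "key")])
```

In this model, keeping the parent's resolver instead would leave `tmp2.key` and `tmp2.value` unresolvable after the merge. That is one plausible way for a downstream join to lose columns, as in the plan above, but the link to the actual bug is a guess.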
[jira] [Updated] (HIVE-4968) When deduplicating multiple SelectOperators, we should update RowResolver accordingly
[ https://issues.apache.org/jira/browse/HIVE-4968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yin Huai updated HIVE-4968:
---
Status: Open (was: Patch Available)
[jira] [Updated] (HIVE-4968) When deduplicating multiple SelectOperators, we should update RowResolver accordingly
[ https://issues.apache.org/jira/browse/HIVE-4968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Phabricator updated HIVE-4968:
--
Attachment: HIVE-4968.D11901.2.patch

yhuai updated the revision HIVE-4968 [jira] When deduplicate multiple SelectOperators, we should update RowResolver accordinly.

addressed Ashutosh's comments

Reviewers: ashutoshc, JIRA

REVISION DETAIL
  https://reviews.facebook.net/D11901

CHANGE SINCE LAST DIFF
  https://reviews.facebook.net/D11901?vs=36669id=36693#toc

BRANCH
  HIVE-4968

ARCANIST PROJECT
  hive

AFFECTED FILES
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/NonBlockingOpDeDupProc.java
  ql/src/java/org/apache/hadoop/hive/ql/parse/ParseContext.java
  ql/src/test/queries/clientpositive/nonblock_op_deduplicate.q
  ql/src/test/results/clientpositive/nonblock_op_deduplicate.q.out

To: JIRA, ashutoshc, yhuai
[jira] [Updated] (HIVE-4968) When deduplicating multiple SelectOperators, we should update RowResolver accordingly
[ https://issues.apache.org/jira/browse/HIVE-4968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Yin Huai updated HIVE-4968:
---
Status: Patch Available (was: Open)

addressed Ashutosh's comments