[jira] [Updated] (HIVE-4968) When deduplicating multiple SelectOperators, we should update RowResolver accordingly

2013-07-31 Thread Yin Huai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yin Huai updated HIVE-4968:
---

Summary: When deduplicating multiple SelectOperators, we should update
RowResolver accordingly  (was: When deduplicate multiple SelectOperators, we
should update RowResolver accordingly)

 When deduplicating multiple SelectOperators, we should update RowResolver
 accordingly
 

                 Key: HIVE-4968
                 URL: https://issues.apache.org/jira/browse/HIVE-4968
             Project: Hive
          Issue Type: Bug
            Reporter: Yin Huai
            Assignee: Yin Huai
         Attachments: HIVE-4968.D11901.1.patch


 {code:sql}
 SELECT tmp3.key, tmp3.value, tmp3.count
 FROM (SELECT tmp1.key as key, tmp1.value as value, tmp2.count as count
       FROM (SELECT key, value
             FROM src) tmp1
       JOIN (SELECT count(*) as count
             FROM src) tmp2
      ) tmp3;
 {code}
 The plan is executable.
 {code:sql}
 SELECT tmp3.key, tmp3.value, tmp3.count
 FROM (SELECT tmp1.key as key, tmp1.value as value, tmp2.count as count
       FROM (SELECT *
             FROM src) tmp1
       JOIN (SELECT count(*) as count
             FROM src) tmp2
      ) tmp3;
 {code}
 The plan is executable.
 {code:sql}
 SELECT tmp4.key, tmp4.value, tmp4.count
 FROM (SELECT tmp2.key as key, tmp2.value as value, tmp3.count as count
       FROM (SELECT *
             FROM (SELECT key, value
                   FROM src) tmp1 ) tmp2
       JOIN (SELECT count(*) as count
             FROM src) tmp3
      ) tmp4;
 {code}
 The plan is not executable.
 The part of the plan related to the MapJoin is:
 {code}
   Stage: Stage-5
     Map Reduce Local Work
       Alias -> Map Local Tables:
         tmp4:tmp2:tmp1:src
           Fetch Operator
             limit: -1
       Alias -> Map Local Operator Tree:
         tmp4:tmp2:tmp1:src
           TableScan
             alias: src
             Select Operator
               expressions:
                     expr: key
                     type: string
                     expr: value
                     type: string
               outputColumnNames: _col0, _col1
               HashTable Sink Operator
                 condition expressions:
                   0
                   1 {_col0}
                 handleSkewJoin: false
                 keys:
                   0 []
                   1 []
                 Position of Big Table: 1

   Stage: Stage-4
     Map Reduce
       Alias -> Map Operator Tree:
         $INTNAME
             Map Join Operator
               condition map:
                    Inner Join 0 to 1
               condition expressions:
                 0
                 1 {_col0}
               handleSkewJoin: false
               keys:
                 0 []
                 1 []
               outputColumnNames: _col2
               Position of Big Table: 1
               Select Operator
                 expressions:
                       expr: _col0
                       type: string
                       expr: _col1
                       type: string
                       expr: _col2
                       type: bigint
                 outputColumnNames: _col0, _col1, _col2
                 File Output Operator
                   compressed: false
                   GlobalTableId: 0
                   table:
                       input format: org.apache.hadoop.mapred.TextInputFormat
                       output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
       Local Work:
         Map Reduce Local Work
 {code}
 The outputColumnNames of the MapJoin is '_col2', but it should be
 '_col0, _col1, _col2'.
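
To make the symptom concrete, here is a toy Python model of how a stale column mapping could lose columns after two select operators are merged. It is only a sketch under stated assumptions: ToySelect, merge and resolve are hypothetical names invented for illustration, they are not Hive's SelectOperator or RowResolver classes, and the actual mechanism fixed by the patch may differ.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

# Hypothetical toy types; these are NOT Hive's SelectOperator or RowResolver.
Alias = Tuple[str, str]          # (table alias, column name), e.g. ("tmp2", "key")


@dataclass
class ToySelect:
    exprs: Dict[str, str]        # output column -> input column it reads
    resolver: Dict[Alias, str]   # visible (alias, column) -> output column


def merge(parent: ToySelect, child: ToySelect, update_resolver: bool) -> ToySelect:
    """Fold child into parent, as a select-select deduplication would."""
    # Rewrite each child expression in terms of the parent's inputs.
    exprs = {out: parent.exprs[src] for out, src in child.exprs.items()}
    # The merged operator exposes the child's names, so it needs the child's mapping.
    resolver = dict(child.resolver if update_resolver else parent.resolver)
    return ToySelect(exprs, resolver)


def resolve(op: ToySelect, refs: List[Alias]) -> List[str]:
    """Return the output columns that downstream references can actually find."""
    return [op.resolver[r] for r in refs if r in op.resolver]


# tmp1: SELECT key, value FROM src
tmp1 = ToySelect({"_col0": "key", "_col1": "value"},
                 {("tmp1", "key"): "_col0", ("tmp1", "value"): "_col1"})
# tmp2: SELECT * FROM tmp1
tmp2 = ToySelect({"_col0": "_col0", "_col1": "_col1"},
                 {("tmp2", "key"): "_col0", ("tmp2", "value"): "_col1"})

refs = [("tmp2", "key"), ("tmp2", "value")]   # columns the outer query asks for
stale = resolve(merge(tmp1, tmp2, update_resolver=False), refs)
fresh = resolve(merge(tmp1, tmp2, update_resolver=True), refs)
print(stale, fresh)   # [] ['_col0', '_col1']
```

In this model the merged operator that keeps the old mapping resolves none of the tmp2 references, which resembles the plan above, where only _col2 survives in the MapJoin output.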

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira


[jira] [Updated] (HIVE-4968) When deduplicating multiple SelectOperators, we should update RowResolver accordingly

2013-07-31 Thread Yin Huai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yin Huai updated HIVE-4968:
---

Status: Open  (was: Patch Available)



[jira] [Updated] (HIVE-4968) When deduplicating multiple SelectOperators, we should update RowResolver accordingly

2013-07-31 Thread Phabricator (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Phabricator updated HIVE-4968:
--

Attachment: HIVE-4968.D11901.2.patch

yhuai updated the revision "HIVE-4968 [jira] When deduplicating multiple
SelectOperators, we should update RowResolver accordingly".

  addressed Ashutosh's comments

Reviewers: ashutoshc, JIRA

REVISION DETAIL
  https://reviews.facebook.net/D11901

CHANGE SINCE LAST DIFF
  https://reviews.facebook.net/D11901?vs=36669&id=36693#toc

BRANCH
  HIVE-4968

ARCANIST PROJECT
  hive

AFFECTED FILES
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/NonBlockingOpDeDupProc.java
  ql/src/java/org/apache/hadoop/hive/ql/parse/ParseContext.java
  ql/src/test/queries/clientpositive/nonblock_op_deduplicate.q
  ql/src/test/results/clientpositive/nonblock_op_deduplicate.q.out

To: JIRA, ashutoshc, yhuai


 When deduplicating multiple SelectOperators, we should update RowResolver
 accordingly
 

                 Key: HIVE-4968
                 URL: https://issues.apache.org/jira/browse/HIVE-4968
             Project: Hive
          Issue Type: Bug
            Reporter: Yin Huai
            Assignee: Yin Huai
         Attachments: HIVE-4968.D11901.1.patch, HIVE-4968.D11901.2.patch




[jira] [Updated] (HIVE-4968) When deduplicating multiple SelectOperators, we should update RowResolver accordingly

2013-07-31 Thread Yin Huai (JIRA)

 [ 
https://issues.apache.org/jira/browse/HIVE-4968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yin Huai updated HIVE-4968:
---

Status: Patch Available  (was: Open)

addressed Ashutosh's comments
