[jira] [Commented] (HIVE-4968) When deduplicating multiple SelectOperators, we should update RowResolver accordingly

Phabricator (JIRA) Wed, 31 Jul 2013 15:20:02 -0700

    [ https://issues.apache.org/jira/browse/HIVE-4968?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13725783#comment-13725783 ]


Phabricator commented on HIVE-4968:
-----------------------------------

ashutoshc has accepted the revision "HIVE-4968 [jira] When deduplicating multiple 
SelectOperators, we should update RowResolver accordingly".

  Looks good. Some minor comments.

INLINE COMMENTS
  ql/src/java/org/apache/hadoop/hive/ql/parse/ParseContext.java:399 You are not 
using this method. Let's not add it.
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/NonBlockingOpDeDupProc.java:104 
Can you add a comment saying something like: we need to set the row resolver of 
the parent from that of the child, which is in the parse context, to preserve 
column mappings.
  Feel free to improve on the wording here.

REVISION DETAIL
  https://reviews.facebook.net/D11901

BRANCH
  HIVE-4968

ARCANIST PROJECT
  hive

To: JIRA, ashutoshc, yhuai

                
> When deduplicating multiple SelectOperators, we should update RowResolver 
> accordingly
> ------------------------------------------------------------------------------------
>
>                 Key: HIVE-4968
>                 URL: https://issues.apache.org/jira/browse/HIVE-4968
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Yin Huai
>            Assignee: Yin Huai
>         Attachments: HIVE-4968.D11901.1.patch
>
>
> {code:Sql}
> SELECT tmp3.key, tmp3.value, tmp3.count
> FROM (SELECT tmp1.key as key, tmp1.value as value, tmp2.count as count
>       FROM (SELECT key, value
>             FROM src) tmp1
>       JOIN (SELECT count(*) as count
>             FROM src) tmp2
>       ) tmp3;
> {code}
> The plan is executable.
> {code:sql}
> SELECT tmp3.key, tmp3.value, tmp3.count
> FROM (SELECT tmp1.key as key, tmp1.value as value, tmp2.count as count
>       FROM (SELECT *
>             FROM src) tmp1
>       JOIN (SELECT count(*) as count
>             FROM src) tmp2
>       ) tmp3;
> {code}
> The plan is executable.
> {code}
> SELECT tmp4.key, tmp4.value, tmp4.count
> FROM (SELECT tmp2.key as key, tmp2.value as value, tmp3.count as count
>       FROM (SELECT *
>             FROM (SELECT key, value
>                   FROM src) tmp1 ) tmp2
>       JOIN (SELECT count(*) as count
>             FROM src) tmp3
>       ) tmp4;
> {code}
> The plan is not executable.
> The plan related to the MapJoin is
> {code}
>  Stage: Stage-5
>     Map Reduce Local Work
>       Alias -> Map Local Tables:
>         tmp4:tmp2:tmp1:src 
>           Fetch Operator
>             limit: -1
>       Alias -> Map Local Operator Tree:
>         tmp4:tmp2:tmp1:src 
>           TableScan
>             alias: src
>             Select Operator
>               expressions:
>                     expr: key
>                     type: string
>                     expr: value
>                     type: string
>               outputColumnNames: _col0, _col1
>               HashTable Sink Operator
>                 condition expressions:
>                   0 
>                   1 {_col0}
>                 handleSkewJoin: false
>                 keys:
>                   0 []
>                   1 []
>                 Position of Big Table: 1
>   Stage: Stage-4
>     Map Reduce
>       Alias -> Map Operator Tree:
>         $INTNAME 
>             Map Join Operator
>               condition map:
>                    Inner Join 0 to 1
>               condition expressions:
>                 0 
>                 1 {_col0}
>               handleSkewJoin: false
>               keys:
>                 0 []
>                 1 []
>               outputColumnNames: _col2
>               Position of Big Table: 1
>               Select Operator
>                 expressions:
>                       expr: _col0
>                       type: string
>                       expr: _col1
>                       type: string
>                       expr: _col2
>                       type: bigint
>                 outputColumnNames: _col0, _col1, _col2
>                 File Output Operator
>                   compressed: false
>                   GlobalTableId: 0
>                   table:
>                       input format: org.apache.hadoop.mapred.TextInputFormat
>                       output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
>       Local Work:
>         Map Reduce Local Work
> {code}
> The outputColumnNames of the MapJoin operator is '_col2', but it should be 
> '_col0, _col1, _col2'.
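
To illustrate why the row resolver has to follow the merge, here is a minimal toy sketch in Java. It is not the Hive code: ToyRowResolver, ToySelect, merge and resolveAfterMerge are hypothetical names invented for this example, and the model only handles plain column references. It assumes that the child select (the "SELECT *" behind tmp2) is folded into its parent (the "SELECT key, value" behind tmp1), and it shows that a later lookup of tmp2.key only succeeds if the surviving operator takes over the child's row resolver.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Toy model only: these classes are hypothetical stand-ins, not Hive's
// SelectOperator, RowResolver, or ParseContext.
public class SelectDedupSketch {

    /** Maps "alias.column" to an internal column name such as _col0. */
    static final class ToyRowResolver {
        private final Map<String, String> columns = new LinkedHashMap<>();

        ToyRowResolver put(String alias, String column, String internalName) {
            columns.put(alias + "." + column, internalName);
            return this;
        }

        /** Returns the internal name, or null when the mapping is unknown. */
        String resolve(String alias, String column) {
            return columns.get(alias + "." + column);
        }
    }

    /** A select operator together with the row resolver recorded for it. */
    static final class ToySelect {
        final List<String> expressions;       // plain references to input columns
        final List<String> outputColumnNames;
        final ToyRowResolver rowResolver;

        ToySelect(List<String> expressions, List<String> outputColumnNames,
                  ToyRowResolver rowResolver) {
            this.expressions = expressions;
            this.outputColumnNames = outputColumnNames;
            this.rowResolver = rowResolver;
        }
    }

    /** Folds child into parent; optionally carries the child's row resolver over. */
    static ToySelect merge(ToySelect parent, ToySelect child, boolean updateRowResolver) {
        List<String> rewritten = new ArrayList<>();
        for (String expr : child.expressions) {
            // Each child expression names a parent output column; replace it
            // with the parent expression that produces that column.
            int idx = parent.outputColumnNames.indexOf(expr);
            if (idx < 0) {
                throw new IllegalArgumentException("unknown column: " + expr);
            }
            rewritten.add(parent.expressions.get(idx));
        }
        ToyRowResolver resolver = updateRowResolver ? child.rowResolver : parent.rowResolver;
        return new ToySelect(rewritten, child.outputColumnNames, resolver);
    }

    /** Merges the two selects behind tmp1 and tmp2, then looks up tmp2.key. */
    static String resolveAfterMerge(boolean updateRowResolver) {
        ToySelect tmp1 = new ToySelect(
            Arrays.asList("key", "value"), Arrays.asList("_col0", "_col1"),
            new ToyRowResolver().put("tmp1", "key", "_col0").put("tmp1", "value", "_col1"));
        ToySelect tmp2 = new ToySelect(
            Arrays.asList("_col0", "_col1"), Arrays.asList("_col0", "_col1"),
            new ToyRowResolver().put("tmp2", "key", "_col0").put("tmp2", "value", "_col1"));
        return merge(tmp1, tmp2, updateRowResolver).rowResolver.resolve("tmp2", "key");
    }

    public static void main(String[] args) {
        System.out.println("with update:    " + resolveAfterMerge(true));
        System.out.println("without update: " + resolveAfterMerge(false));
    }
}
```

In this toy model the merged operator still knows only the tmp1 aliases when the row resolver is left untouched, so the tmp2.key lookup returns null. That is one plausible way for columns to go missing downstream, as with the MapJoin's outputColumnNames in the plan above, but the actual mechanism in Hive may differ in detail.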

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira