[jira] [Updated] (HIVE-4968) When deduplicating multiple SelectOperators, we should update RowResolver accordingly

Phabricator (JIRA) Wed, 31 Jul 2013 14:08:06 -0700

     [ https://issues.apache.org/jira/browse/HIVE-4968?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]


Phabricator updated HIVE-4968:
------------------------------

    Attachment: HIVE-4968.D11901.1.patch

yhuai requested code review of "HIVE-4968 [jira] When deduplicating multiple 
SelectOperators, we should update RowResolver accordingly".

Reviewers: JIRA

Merge remote-tracking branch 'origin/trunk' into HIVE-4968

SELECT tmp3.key, tmp3.value, tmp3.count
FROM (SELECT tmp1.key as key, tmp1.value as value, tmp2.count as count
      FROM (SELECT key, value
            FROM src) tmp1
      JOIN (SELECT count(*) as count
            FROM src) tmp2
      ) tmp3;

The plan is executable.

SELECT tmp3.key, tmp3.value, tmp3.count
FROM (SELECT tmp1.key as key, tmp1.value as value, tmp2.count as count
      FROM (SELECT *
            FROM src) tmp1
      JOIN (SELECT count(*) as count
            FROM src) tmp2
      ) tmp3;

The plan is executable.

SELECT tmp4.key, tmp4.value, tmp4.count
FROM (SELECT tmp2.key as key, tmp2.value as value, tmp3.count as count
      FROM (SELECT *
            FROM (SELECT key, value
                  FROM src) tmp1 ) tmp2
      JOIN (SELECT count(*) as count
            FROM src) tmp3
      ) tmp4;

The plan is not executable.
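
Unlike the two queries above, this one stacks a SELECT * directly on top of
SELECT key, value, which is presumably the shape that the SelectOperator
deduplication named in the issue title collapses into one operator. The
following is a minimal toy sketch in Java of such a merge, not Hive's actual
NonBlockingOpDeDupProc code. SelectSketch, its fields, and the string-keyed
resolver map are hypothetical stand-ins for SelectOperator and RowResolver. It
assumes every expression is a bare column name, and that the merged operator
should keep the child's name-to-column mapping so that lookups such as
tmp2.key still resolve.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Toy model only (hypothetical, not a Hive class): one select operator.
final class SelectSketch {
    final List<String> expressions;        // input column read for each output position
    final List<String> outputColumnNames;  // internal names produced, e.g. _col0
    final Map<String, String> resolver;    // user-visible name -> internal name

    SelectSketch(List<String> expressions, List<String> outputColumnNames,
                 Map<String, String> resolver) {
        this.expressions = expressions;
        this.outputColumnNames = outputColumnNames;
        this.resolver = resolver;
    }

    // Collapse "child over parent" into a single select.
    // Each child expression names one of the parent's outputs, so it is
    // replaced by the parent expression that produced that output.
    static SelectSketch merge(SelectSketch parent, SelectSketch child) {
        List<String> mergedExpressions = new ArrayList<>();
        for (String expr : child.expressions) {
            int idx = parent.outputColumnNames.indexOf(expr);
            if (idx < 0) {
                throw new IllegalArgumentException("child reads unknown column " + expr);
            }
            mergedExpressions.add(parent.expressions.get(idx));
        }
        // Assumption: the merged operator takes over the child's output names
        // and the child's resolver, since downstream operators were planned
        // against the child, not the parent.
        return new SelectSketch(mergedExpressions, child.outputColumnNames, child.resolver);
    }

    public static void main(String[] args) {
        SelectSketch tmp1 = new SelectSketch(
            List.of("key", "value"), List.of("_col0", "_col1"),
            Map.of("tmp1.key", "_col0", "tmp1.value", "_col1"));
        SelectSketch tmp2 = new SelectSketch(
            List.of("_col0", "_col1"), List.of("_col0", "_col1"),
            Map.of("tmp2.key", "_col0", "tmp2.value", "_col1"));
        SelectSketch merged = merge(tmp1, tmp2);
        System.out.println(merged.expressions + " -> " + merged.outputColumnNames);
        System.out.println(merged.resolver.keySet());
    }
}
```

In this toy model, keeping the parent's resolver instead would leave tmp2.key
and tmp2.value unresolvable for whatever consumes the merged select.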

The part of the plan related to the MapJoin is:

 Stage: Stage-5
    Map Reduce Local Work
      Alias -> Map Local Tables:
        tmp4:tmp2:tmp1:src
          Fetch Operator
            limit: -1
      Alias -> Map Local Operator Tree:
        tmp4:tmp2:tmp1:src
          TableScan
            alias: src
            Select Operator
              expressions:
                    expr: key
                    type: string
                    expr: value
                    type: string
              outputColumnNames: _col0, _col1
              HashTable Sink Operator
                condition expressions:
                  0
                  1 {_col0}
                handleSkewJoin: false
                keys:
                  0 []
                  1 []
                Position of Big Table: 1

  Stage: Stage-4
    Map Reduce
      Alias -> Map Operator Tree:
        $INTNAME
            Map Join Operator
              condition map:
                   Inner Join 0 to 1
              condition expressions:
                0
                1 {_col0}
              handleSkewJoin: false
              keys:
                0 []
                1 []
              outputColumnNames: _col2
              Position of Big Table: 1
              Select Operator
                expressions:
                      expr: _col0
                      type: string
                      expr: _col1
                      type: string
                      expr: _col2
                      type: bigint
                outputColumnNames: _col0, _col1, _col2
                File Output Operator
                  compressed: false
                  GlobalTableId: 0
                  table:
                      input format: org.apache.hadoop.mapred.TextInputFormat
                      output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
      Local Work:
        Map Reduce Local Work

The outputColumnNames of the Map Join Operator is only '_col2', but it should
be '_col0, _col1, _col2', because the Select Operator below it reads all three
columns.
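
The mismatch can be stated as a simple check: every column a child operator
reads must appear in its parent's outputColumnNames. The Java sketch below
illustrates that check on the column lists from the plan above. It is only an
illustration, not part of Hive; ColumnCheckSketch and missingColumns are
hypothetical names, and operators are reduced to plain lists of column names.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not a Hive class.
final class ColumnCheckSketch {
    // Returns the columns the child reads that the parent does not output,
    // in the order the child reads them.
    static List<String> missingColumns(List<String> parentOutputs, List<String> childReads) {
        List<String> missing = new ArrayList<>();
        for (String column : childReads) {
            if (!parentOutputs.contains(column)) {
                missing.add(column);
            }
        }
        return missing;
    }

    public static void main(String[] args) {
        // Columns read by the Select Operator under the Map Join Operator.
        List<String> selectReads = List.of("_col0", "_col1", "_col2");

        // outputColumnNames as reported in the broken plan.
        System.out.println(missingColumns(List.of("_col2"), selectReads));

        // outputColumnNames as they should be.
        System.out.println(missingColumns(List.of("_col0", "_col1", "_col2"), selectReads));
    }
}
```

For the broken plan the check reports _col0 and _col1 as missing; for the
expected column list it reports nothing.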

TEST PLAN
  EMPTY

REVISION DETAIL
  https://reviews.facebook.net/D11901

AFFECTED FILES
  ql/src/java/org/apache/hadoop/hive/ql/optimizer/NonBlockingOpDeDupProc.java
  ql/src/java/org/apache/hadoop/hive/ql/parse/ParseContext.java
  ql/src/test/queries/clientpositive/nonblock_op_deduplicate.q
  ql/src/test/results/clientpositive/nonblock_op_deduplicate.q.out

To: JIRA, yhuai

                
> When deduplicating multiple SelectOperators, we should update RowResolver 
> accordingly
> ----------------------------------------------------------------------------------
>
>                 Key: HIVE-4968
>                 URL: https://issues.apache.org/jira/browse/HIVE-4968
>             Project: Hive
>          Issue Type: Bug
>            Reporter: Yin Huai
>            Assignee: Yin Huai
>         Attachments: HIVE-4968.D11901.1.patch
>
>
> {code:sql}
> SELECT tmp3.key, tmp3.value, tmp3.count
> FROM (SELECT tmp1.key as key, tmp1.value as value, tmp2.count as count
>       FROM (SELECT key, value
>             FROM src) tmp1
>       JOIN (SELECT count(*) as count
>             FROM src) tmp2
>       ) tmp3;
> {code}
> The plan is executable.
> {code:sql}
> SELECT tmp3.key, tmp3.value, tmp3.count
> FROM (SELECT tmp1.key as key, tmp1.value as value, tmp2.count as count
>       FROM (SELECT *
>             FROM src) tmp1
>       JOIN (SELECT count(*) as count
>             FROM src) tmp2
>       ) tmp3;
> {code}
> The plan is executable.
> {code}
> SELECT tmp4.key, tmp4.value, tmp4.count
> FROM (SELECT tmp2.key as key, tmp2.value as value, tmp3.count as count
>       FROM (SELECT *
>             FROM (SELECT key, value
>                   FROM src) tmp1 ) tmp2
>       JOIN (SELECT count(*) as count
>             FROM src) tmp3
>       ) tmp4;
> {code}
> The plan is not executable.
> The part of the plan related to the MapJoin is:
> {code}
>  Stage: Stage-5
>     Map Reduce Local Work
>       Alias -> Map Local Tables:
>         tmp4:tmp2:tmp1:src 
>           Fetch Operator
>             limit: -1
>       Alias -> Map Local Operator Tree:
>         tmp4:tmp2:tmp1:src 
>           TableScan
>             alias: src
>             Select Operator
>               expressions:
>                     expr: key
>                     type: string
>                     expr: value
>                     type: string
>               outputColumnNames: _col0, _col1
>               HashTable Sink Operator
>                 condition expressions:
>                   0 
>                   1 {_col0}
>                 handleSkewJoin: false
>                 keys:
>                   0 []
>                   1 []
>                 Position of Big Table: 1
>   Stage: Stage-4
>     Map Reduce
>       Alias -> Map Operator Tree:
>         $INTNAME 
>             Map Join Operator
>               condition map:
>                    Inner Join 0 to 1
>               condition expressions:
>                 0 
>                 1 {_col0}
>               handleSkewJoin: false
>               keys:
>                 0 []
>                 1 []
>               outputColumnNames: _col2
>               Position of Big Table: 1
>               Select Operator
>                 expressions:
>                       expr: _col0
>                       type: string
>                       expr: _col1
>                       type: string
>                       expr: _col2
>                       type: bigint
>                 outputColumnNames: _col0, _col1, _col2
>                 File Output Operator
>                   compressed: false
>                   GlobalTableId: 0
>                   table:
>                       input format: org.apache.hadoop.mapred.TextInputFormat
>                       output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
>       Local Work:
>         Map Reduce Local Work
> {code}
> The outputColumnNames of the Map Join Operator is only '_col2', but it should
> be '_col0, _col1, _col2', because the Select Operator below it reads all three
> columns.

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
