[ https://issues.apache.org/jira/browse/HIVE-1678?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12916794#action_12916794 ]

Amareshwari Sriramadasu commented on HIVE-1678:
-----------------------------------------------

The same query succeeds when no MapJoin is used.

It looks like plan generation went wrong in MapJoinProcessor. The explain output for the query:
{noformat}
explain
select /*+MAPJOIN(src, myinput1) */ count(srcpart.key)
from srcpart
  join src on (srcpart.value=src.value)
  join myinput1 on (srcpart.key=myinput1.key);

OK
ABSTRACT SYNTAX TREE:
  (TOK_QUERY (TOK_FROM (TOK_JOIN (TOK_JOIN (TOK_TABREF srcpart) (TOK_TABREF src) (= (. (TOK_TABLE_OR_COL srcpart) value) (. (TOK_TABLE_OR_COL src) value))) (TOK_TABREF myinput1) (= (. (TOK_TABLE_OR_COL srcpart) key) (. (TOK_TABLE_OR_COL myinput1) key)))) (TOK_INSERT (TOK_DESTINATION (TOK_DIR TOK_TMP_FILE)) (TOK_SELECT (TOK_HINTLIST (TOK_HINT TOK_MAPJOIN (TOK_HINTARGLIST src myinput1))) (TOK_SELEXPR (TOK_FUNCTION count (. (TOK_TABLE_OR_COL srcpart) key))))))

STAGE DEPENDENCIES:
  Stage-1 is a root stage
  Stage-2 depends on stages: Stage-1
  Stage-3 depends on stages: Stage-2
  Stage-0 is a root stage

STAGE PLANS:
  Stage: Stage-1
    Map Reduce
      Alias -> Map Operator Tree:
        srcpart
          TableScan
            alias: srcpart
            Common Join Operator
              condition map:
                   Inner Join 0 to 1
              condition expressions:
                0 {key}
                1
              handleSkewJoin: false
              keys:
                0 [Column[value]]
                1 [Column[value]]
              outputColumnNames: _col0
              Position of Big Table: 0
              File Output Operator
                compressed: false
                GlobalTableId: 0
                table:
                    input format: org.apache.hadoop.mapred.SequenceFileInputFormat
                    output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
      Local Work:
        Map Reduce Local Work
          Alias -> Map Local Tables:
            src
              Fetch Operator
                limit: -1
          Alias -> Map Local Operator Tree:
            src
              TableScan
                alias: src
                Common Join Operator
                  condition map:
                       Inner Join 0 to 1
                  condition expressions:
                    0 {key}
                    1
                  handleSkewJoin: false
                  keys:
                    0 [Column[value]]
                    1 [Column[value]]
                  outputColumnNames: _col0
                  Position of Big Table: 0
                  File Output Operator
                    compressed: false
                    GlobalTableId: 0
                    table:
                        input format: org.apache.hadoop.mapred.SequenceFileInputFormat
                        output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat

  Stage: Stage-2
    Map Reduce
      Alias -> Map Operator Tree:
        hdfs://localhost:19000/tmp/hive-amarsri/hive_2010-10-01_11-24-15_198_629889728835692043/-mr-10002
          Select Operator
            expressions:
                  expr: _col0
                  type: int
            outputColumnNames: _col0
            Common Join Operator
              condition map:
                   Inner Join 0 to 1
              condition expressions:
                0 {_col0}
                1
              handleSkewJoin: false
              keys:
                0 [Column[_col0]]
                1 [Column[key]]
              outputColumnNames: _col0
              Position of Big Table: 0
              File Output Operator
                compressed: false
                GlobalTableId: 0
                table:
                    input format: org.apache.hadoop.mapred.SequenceFileInputFormat
                    output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
      Local Work:
        Map Reduce Local Work
          Alias -> Map Local Tables:
            myinput1
              Fetch Operator
                limit: -1
          Alias -> Map Local Operator Tree:
            myinput1
              TableScan
                alias: myinput1
                Common Join Operator
                  condition map:
                       Inner Join 0 to 1
                  condition expressions:
                    0 {_col0}
                    1
                  handleSkewJoin: false
                  keys:
                    0 [Column[_col0]]
                    1 [Column[key]]
                  outputColumnNames: _col0
                  Position of Big Table: 0
                  File Output Operator
                    compressed: false
                    GlobalTableId: 0
                    table:
                        input format: org.apache.hadoop.mapred.SequenceFileInputFormat
                        output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat

  Stage: Stage-3
    Map Reduce
      Alias -> Map Operator Tree:
        hdfs://localhost:19000/tmp/hive-amarsri/hive_2010-10-01_11-24-15_198_629889728835692043/-mr-10002
          Select Operator
            expressions:
                  expr: _col0
                  type: int
            outputColumnNames: _col0
            Common Join Operator
              condition map:
                   Inner Join 0 to 1
              condition expressions:
                0 {_col0}
                1
              handleSkewJoin: false
              keys:
                0 [Column[_col0]]
                1 [Column[key]]
              outputColumnNames: _col0
              Position of Big Table: 0
              File Output Operator
                compressed: false
                GlobalTableId: 0
                table:
                    input format: org.apache.hadoop.mapred.SequenceFileInputFormat
                    output format: org.apache.hadoop.hive.ql.io.HiveSequenceFileOutputFormat
      Reduce Operator Tree:
        Group By Operator
          aggregations:
                expr: count(VALUE._col0)
          bucketGroup: false
          mode: mergepartial
          outputColumnNames: _col0
          Select Operator
            expressions:
                  expr: _col0
                  type: bigint
            outputColumnNames: _col0
            File Output Operator
              compressed: false
              GlobalTableId: 0
              table:
                  input format: org.apache.hadoop.mapred.TextInputFormat
                  output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat

  Stage: Stage-0
    Fetch Operator
      limit: -1

Time taken: 0.202 seconds
{noformat}

If I'm not mistaken, the Join operator should not appear in the third stage (Stage-3).
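
One way to check this mechanically is to scan the explain text for stages that still mention a join operator. The following is only a minimal sketch in Python, assuming the plan text uses the "Stage: Stage-N" headers and "Join Operator" labels seen in the output above; the function name stages_with_join is hypothetical and not part of Hive:

```python
import re


def stages_with_join(explain_text):
    """Return the names of stages whose plan text mentions a join operator."""
    stages = []
    current = None
    for line in explain_text.splitlines():
        # A line such as "  Stage: Stage-3" starts a new stage plan.
        header = re.match(r"\s*Stage: (Stage-\d+)\s*$", line)
        if header:
            current = header.group(1)
        elif current and "Join Operator" in line and current not in stages:
            stages.append(current)
    return stages


# Abbreviated, hand-trimmed plan text used only to exercise the function.
sample = """\
STAGE PLANS:
  Stage: Stage-2
    Map Reduce
            Common Join Operator
  Stage: Stage-3
    Map Reduce
            Common Join Operator
      Reduce Operator Tree:
        Group By Operator
  Stage: Stage-0
    Fetch Operator
"""

print(stages_with_join(sample))
```

Run against the full plan above, this would report Stage-1, Stage-2 and Stage-3, whereas only the first two would be expected to carry a join if the reading above is right.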

> NPE in MapJoin 
> ---------------
>
>                 Key: HIVE-1678
>                 URL: https://issues.apache.org/jira/browse/HIVE-1678
>             Project: Hadoop Hive
>          Issue Type: Bug
>          Components: Query Processor
>            Reporter: Amareshwari Sriramadasu
>
> A query with two map joins and a group by fails with the following NPE:
> Caused by: java.lang.NullPointerException
>         at org.apache.hadoop.hive.ql.exec.MapJoinOperator.processOp(MapJoinOperator.java:177)
>         at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:457)
>         at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:697)
>         at org.apache.hadoop.hive.ql.exec.SelectOperator.processOp(SelectOperator.java:84)
>         at org.apache.hadoop.hive.ql.exec.Operator.process(Operator.java:457)
>         at org.apache.hadoop.hive.ql.exec.Operator.forward(Operator.java:697)
>         at org.apache.hadoop.hive.ql.exec.MapOperator.process(MapOperator.java:464)

