Hi,

I am trying to run a simple join query on Hive 0.13.

Both tables are stored in text format. Both are read by the mappers, but
the error is thrown in the reducer. I don't understand why the reducer is
reading a table that the mappers have already read, or why it assumes that
the video file is in SequenceFile format.
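
One way to double-check the file format is to look at the first bytes of the file: a Hadoop SequenceFile starts with the ASCII bytes SEQ followed by a version byte, and a plain-text file normally does not. Below is a minimal Python sketch, assuming the file has first been copied to the local filesystem (for example with hadoop fs -get). The function name and the local path are hypothetical and are not part of Hive or Hadoop.

```python
# Sketch: check whether a local file starts with the SequenceFile magic bytes.
# Hadoop SequenceFiles begin with the ASCII bytes "SEQ" followed by a version byte.
SEQ_MAGIC = b"SEQ"


def looks_like_sequence_file(path):
    """Return True if the file at `path` starts with the SequenceFile magic."""
    with open(path, "rb") as f:
        return f.read(len(SEQ_MAGIC)) == SEQ_MAGIC


if __name__ == "__main__":
    import sys

    # Hypothetical usage: python check_seq.py ./video_20140611051139
    for local_path in sys.argv[1:]:
        print(local_path, looks_like_sequence_file(local_path))
```

If the table really is plain text, this should report False for a local copy of video_20140611051139, which would suggest the reducer is picking the wrong reader rather than the file being in the wrong format.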

Below are the query, the query plan, and the error. Any help would be
greatly appreciated.

Thanks,

Sid

*Hadoop Version:* 2.0.0-mr1

Query:

SELECT computerguid
FROM revenue_start_adeffx_v2
JOIN video
  ON revenue_start_adeffx_v2.video_id = video.video_id
WHERE hourid = '389567';


Query Plan:

STAGE DEPENDENCIES:
  Stage-1 is a root stage
  Stage-0 is a root stage

STAGE PLANS:
  Stage: Stage-1
    Map Reduce
      Map Operator Tree:
          TableScan
            alias: revenue_start_adeffx_v2
            Statistics: Num rows: 3175840 Data size: 330287403 Basic stats: COMPLETE Column stats: NONE
            Reduce Output Operator
              key expressions: video_id (type: int)
              sort order: +
              Map-reduce partition columns: video_id (type: int)
              Statistics: Num rows: 3175840 Data size: 330287403 Basic stats: COMPLETE Column stats: NONE
              value expressions: computerguid (type: string)
          TableScan
            alias: video
            Statistics: Num rows: 146679792 Data size: 586719168 Basic stats: COMPLETE Column stats: NONE
            Reduce Output Operator
              key expressions: video_id (type: int)
              sort order: +
              Map-reduce partition columns: video_id (type: int)
              Statistics: Num rows: 146679792 Data size: 586719168 Basic stats: COMPLETE Column stats: NONE
      Reduce Operator Tree:
        Join Operator
          condition map:
               Inner Join 0 to 1
          condition expressions:
            0 {VALUE._col0}
            1
          outputColumnNames: _col0
          Statistics: Num rows: 161347776 Data size: 645391104 Basic stats: COMPLETE Column stats: NONE
          Select Operator
            expressions: _col0 (type: string)
            outputColumnNames: _col0
            Statistics: Num rows: 161347776 Data size: 645391104 Basic stats: COMPLETE Column stats: NONE
            File Output Operator
              compressed: false
              Statistics: Num rows: 161347776 Data size: 645391104 Basic stats: COMPLETE Column stats: NONE
              table:
                  input format: org.apache.hadoop.mapred.TextInputFormat
                  output format: org.apache.hadoop.hive.ql.io.HiveIgnoreKeyTextOutputFormat
                  serde: org.apache.hadoop.hive.serde2.lazy.LazySimpleSerDe

  Stage: Stage-0
    Fetch Operator
      limit: -1



Error:

2014-06-11 10:18:34,818 FATAL ExecReducer: org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: hdfs://<NN><Path>/video/video_20140611051139 not a SequenceFile
    at org.apache.hadoop.hive.ql.exec.persistence.RowContainer.first(RowContainer.java:237)
    at org.apache.hadoop.hive.ql.exec.persistence.RowContainer.first(RowContainer.java:74)
    at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genUniqueJoinObject(CommonJoinOperator.java:644)
    at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:758)
    at org.apache.hadoop.hive.ql.exec.JoinOperator.endGroup(JoinOperator.java:256)
    at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:216)
    at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:506)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:447)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
    at org.apache.hadoop.mapred.Child.main(Child.java:262)
Caused by: java.io.IOException: hdfs:/<NN><Path>/hive/warehouse/video/video_20140611051139 not a SequenceFile
    at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1805)
    at org.apache.hadoop.io.SequenceFile$Reader.initialize(SequenceFile.java:1765)
    at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1714)
    at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1728)
    at org.apache.hadoop.mapred.SequenceFileRecordReader.<init>(SequenceFileRecordReader.java:43)
    at org.apache.hadoop.mapred.SequenceFileInputFormat.getRecordReader(SequenceFileInputFormat.java:59)
    at org.apache.hadoop.hive.ql.exec.persistence.RowContainer.first(RowContainer.java:226)
    ... 12 more


2014-06-11 10:18:34,822 INFO org.apache.hadoop.mapred.TaskLogsTruncater: Initializing logs' truncater with mapRetainSize=-1 and reduceRetainSize=-1

2014-06-11 10:18:34,824 WARN org.apache.hadoop.mapred.Child: Error running child
java.lang.RuntimeException: org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: hdfs://<NN><Path>/video_20140611051139 not a SequenceFile
    at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:283)
    at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:506)
    at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:447)
    at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:396)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
    at org.apache.hadoop.mapred.Child.main(Child.java:262)
Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.io.IOException: hdfs://<NN><Path>/video/video_20140611051139 not a SequenceFile
    at org.apache.hadoop.hive.ql.exec.persistence.RowContainer.first(RowContainer.java:237)
    at org.apache.hadoop.hive.ql.exec.persistence.RowContainer.first(RowContainer.java:74)
    at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.genUniqueJoinObject(CommonJoinOperator.java:644)
    at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:758)
    at org.apache.hadoop.hive.ql.exec.JoinOperator.endGroup(JoinOperator.java:256)
    at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:216)
    ... 7 more
Caused by: java.io.IOException: hdfs://<NN><Path>/video/video_20140611051139 not a SequenceFile
    at org.apache.hadoop.io.SequenceFile$Reader.init(SequenceFile.java:1805)
    at org.apache.hadoop.io.SequenceFile$Reader.initialize(SequenceFile.java:1765)
    at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1714)
    at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1728)
    at org.apache.hadoop.mapred.SequenceFileRecordReader.<init>(SequenceFileRecordReader.java:43)
    at org.apache.hadoop.mapred.SequenceFileInputFormat.getRecordReader(SequenceFileInputFormat.java:59)
    at org.apache.
