Hi Ashutosh,

Thanks for your reply.
I am not sure HIVE-9324 is the same issue we hit. It turned out to be a bug in
CDH when running Hive 0.13.1 on MR1; the bug does not appear when running
0.13.1 on YARN.

Guodong

On Fri, Jan 16, 2015 at 1:21 AM, Ashutosh Chauhan <hashut...@apache.org> wrote:

> Seems like you are running into:
> https://issues.apache.org/jira/browse/HIVE-9324
>
> On Thu, Jan 15, 2015 at 1:53 AM, Guodong Wang <wangg...@gmail.com> wrote:
>
>> Hi,
>>
>> I am using Hive 0.13.1 and I am currently blocked by a bug when joining 2
>> tables. Here is the sample query:
>>
>> INSERT OVERWRITE TABLE test_archive PARTITION(data='2015-01-17', name, type)
>> SELECT COALESCE(b.resource_id, a.id) AS id,
>>        a.timstamp,
>>        a.payload,
>>        a.name,
>>        a.type
>> FROM test_data a LEFT OUTER JOIN id_mapping b ON a.id = b.id
>> WHERE a.date='2015-01-17'
>>   AND a.name IN ('a', 'b', 'c')
>>   AND a.type <= 14;
>>
>> It turns out that when there are more than 25000 joined rows for a specific
>> id, the Hive MR job fails, throwing NegativeArraySizeException.
>>
>> Here is the stack trace:
>>
>> 2015-01-15 14:38:42,693 ERROR org.apache.hadoop.hive.ql.exec.persistence.RowContainer:
>> java.lang.NegativeArraySizeException
>> at org.apache.hadoop.io.BytesWritable.setCapacity(BytesWritable.java:144)
>> at org.apache.hadoop.io.BytesWritable.setSize(BytesWritable.java:123)
>> at org.apache.hadoop.io.BytesWritable.readFields(BytesWritable.java:179)
>> at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:71)
>> at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:42)
>> at org.apache.hadoop.io.SequenceFile$Reader.deserializeValue(SequenceFile.java:2244)
>> at org.apache.hadoop.io.SequenceFile$Reader.getCurrentValue(SequenceFile.java:2228)
>> at org.apache.hadoop.mapred.SequenceFileRecordReader.getCurrentValue(SequenceFileRecordReader.java:103)
>> at org.apache.hadoop.mapred.SequenceFileRecordReader.next(SequenceFileRecordReader.java:78)
>> at org.apache.hadoop.hive.ql.exec.persistence.RowContainer.nextBlock(RowContainer.java:360)
>> at org.apache.hadoop.hive.ql.exec.persistence.RowContainer.first(RowContainer.java:230)
>> at org.apache.hadoop.hive.ql.exec.persistence.RowContainer.first(RowContainer.java:74)
>> at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:740)
>> at org.apache.hadoop.hive.ql.exec.JoinOperator.endGroup(JoinOperator.java:256)
>> at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:216)
>> at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:506)
>> at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:447)
>> at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
>> at java.security.AccessController.doPrivileged(Native Method)
>> at javax.security.auth.Subject.doAs(Subject.java:415)
>> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
>> at org.apache.hadoop.mapred.Child.main(Child.java:262)
>> 2015-01-15 14:38:42,707 FATAL ExecReducer: org.apache.hadoop.hive.ql.metadata.HiveException:
>> org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.NegativeArraySizeException
>> at org.apache.hadoop.hive.ql.exec.persistence.RowContainer.first(RowContainer.java:237)
>> at org.apache.hadoop.hive.ql.exec.persistence.RowContainer.first(RowContainer.java:74)
>> at org.apache.hadoop.hive.ql.exec.CommonJoinOperator.checkAndGenObject(CommonJoinOperator.java:740)
>> at org.apache.hadoop.hive.ql.exec.JoinOperator.endGroup(JoinOperator.java:256)
>> at org.apache.hadoop.hive.ql.exec.mr.ExecReducer.reduce(ExecReducer.java:216)
>> at org.apache.hadoop.mapred.ReduceTask.runOldReducer(ReduceTask.java:506)
>> at org.apache.hadoop.mapred.ReduceTask.run(ReduceTask.java:447)
>> at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
>> at java.security.AccessController.doPrivileged(Native Method)
>> at javax.security.auth.Subject.doAs(Subject.java:415)
>> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1614)
>> at org.apache.hadoop.mapred.Child.main(Child.java:262)
>> Caused by: org.apache.hadoop.hive.ql.metadata.HiveException: java.lang.NegativeArraySizeException
>> at org.apache.hadoop.hive.ql.exec.persistence.RowContainer.nextBlock(RowContainer.java:385)
>> at org.apache.hadoop.hive.ql.exec.persistence.RowContainer.first(RowContainer.java:230)
>> ... 11 more
>> Caused by: java.lang.NegativeArraySizeException
>> at org.apache.hadoop.io.BytesWritable.setCapacity(BytesWritable.java:144)
>> at org.apache.hadoop.io.BytesWritable.setSize(BytesWritable.java:123)
>> at org.apache.hadoop.io.BytesWritable.readFields(BytesWritable.java:179)
>> at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:71)
>> at org.apache.hadoop.io.serializer.WritableSerialization$WritableDeserializer.deserialize(WritableSerialization.java:42)
>> at org.apache.hadoop.io.SequenceFile$Reader.deserializeValue(SequenceFile.java:2244)
>> at org.apache.hadoop.io.SequenceFile$Reader.getCurrentValue(SequenceFile.java:2228)
>> at org.apache.hadoop.mapred.SequenceFileRecordReader.getCurrentValue(SequenceFileRecordReader.java:103)
>> at org.apache.hadoop.mapred.SequenceFileRecordReader.next(SequenceFileRecordReader.java:78)
>> at org.apache.hadoop.hive.ql.exec.persistence.RowContainer.nextBlock(RowContainer.java:360)
>> ... 12 more
>>
>> I found that when the exception is thrown, there is a log entry like this:
>>
>> 2015-01-15 14:38:42,045 INFO org.apache.hadoop.hive.ql.exec.persistence.RowContainer: RowContainer created temp file /local/data0/mapred/taskTracker/ubuntu/jobcache/job_201412171918_0957/attempt_201412171918_0957_r_000000_0/work/tmp/hive-rowcontainer5023288010679723993/RowContainer5093924743042924240.tmp
>>
>> It looks like once RowContainer collects more than 25000 row records, it
>> flushes the block to local disk, but it then cannot read those blocks back.
>>
>> Any help is really appreciated!
>>
>> Guodong
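
For anyone tracing the same failure: the stack points at BytesWritable.readFields,
which reads a record length back from the spilled SequenceFile block and then grows
its backing buffer to length * 3 / 2 in setSize. If the length it reads is garbage
large enough for that multiplication to overflow int, setCapacity is asked to allocate
a negative-sized array, which is exactly the NegativeArraySizeException above. The
snippet below is only a standalone sketch of that path, not the Hive code; the
800,000,000 length is an arbitrary value picked to force the overflow, and the demo
reproduces the same top three frames seen in the trace.

import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;

import org.apache.hadoop.io.BytesWritable;

// Illustration only: feed BytesWritable.readFields a bogus length prefix, the
// kind a corrupted spill block could contain. Requires hadoop-common on the
// classpath.
public class NegativeArraySizeDemo {
  public static void main(String[] args) throws Exception {
    ByteArrayOutputStream buf = new ByteArrayOutputStream();
    DataOutputStream out = new DataOutputStream(buf);
    // 800,000,000 * 3 overflows int to a negative value, so setSize ends up
    // asking setCapacity for a negative capacity.
    out.writeInt(800000000);
    out.flush();

    BytesWritable value = new BytesWritable();
    DataInputStream in =
        new DataInputStream(new ByteArrayInputStream(buf.toByteArray()));
    // Throws java.lang.NegativeArraySizeException at BytesWritable.setCapacity,
    // the same frame as in the reducer log.
    value.readFields(in);
  }
}

Separately, the 25000-row threshold in the log appears to line up with the default of
hive.join.cache.size (25000), which controls how many rows per join key RowContainer
keeps in memory before spilling a block; that would explain why the error only shows
up once a key exceeds roughly that many rows, though changing the setting would at
best move the threshold rather than fix the failing read-back.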