[ https://issues.apache.org/jira/browse/HIVE-5970?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13841743#comment-13841743 ]
Aleksei commented on HIVE-5970:
-------------------------------

My findings show that there is a problem in run length encoding. You can reproduce the problem with the following steps:

1. Create the table:
{code:sql}
CREATE TABLE test_orc_format(
  site STRING,
  a DOUBLE,
  b BIGINT,
  c BIGINT,
  d BIGINT,
  e DOUBLE,
  f DOUBLE,
  g DOUBLE,
  h DOUBLE,
  i DOUBLE,
  j DOUBLE,
  k BIGINT,
  l BIGINT,
  m BIGINT,
  n BIGINT,
  o BIGINT,
  p BIGINT,
  q ARRAY<DOUBLE>,
  r ARRAY<DOUBLE>
)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY '\t'
  COLLECTION ITEMS TERMINATED BY ','
STORED AS ORC
;
{code}

2. Load the data from the attached file:
{code:sql}
load data local inpath 'test_data' overwrite into table test_orc_format;
{code}

3. Run either of the following queries:
{code:sql}
select * from test_orc_format;
select o from test_orc_format;
{code}

Note that the attached file was created by Hive during a job execution and not crafted by hand; it might be wrongly encoded as well. Also note that the query that computes column "o" cannot produce negative results.

> ArrayIndexOutOfBoundsException in RunLengthIntegerReaderV2.java
> ---------------------------------------------------------------
>
>                 Key: HIVE-5970
>                 URL: https://issues.apache.org/jira/browse/HIVE-5970
>             Project: Hive
>          Issue Type: Bug
>          Components: File Formats
>    Affects Versions: 0.12.0
>            Reporter: Eric Chu
>            Priority: Critical
>              Labels: orcfile
>         Attachments: test_data
>
>
> A workload involving ORC tables starts getting the following
> ArrayIndexOutOfBoundsException AFTER the upgrade to Hive 0.12. The file is
> added as part of HIVE-4123.
> 2013-12-04 14:42:08,537 ERROR
> cause:java.io.IOException: java.io.IOException: java.lang.ArrayIndexOutOfBoundsException: 0
> 2013-12-04 14:42:08,537 WARN org.apache.hadoop.mapred.Child: Error running child
> java.io.IOException: java.io.IOException: java.lang.ArrayIndexOutOfBoundsException: 0
>       at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
>       at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
>       at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:304)
>       at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.next(HadoopShimsSecure.java:220)
>       at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:215)
>       at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:200)
>       at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:48)
>       at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:417)
>       at org.apache.hadoop.mapred.MapTask.run(MapTask.java:332)
>       at org.apache.hadoop.mapred.Child$4.run(Child.java:268)
>       at java.security.AccessController.doPrivileged(Native Method)
>       at javax.security.auth.Subject.doAs(Subject.java:396)
>       at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1408)
>       at org.apache.hadoop.mapred.Child.main(Child.java:262)
> Caused by: java.io.IOException: java.lang.ArrayIndexOutOfBoundsException: 0
>       at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
>       at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
>       at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:276)
>       at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:101)
>       at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:41)
>       at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:108)
>       at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:302)
>       ... 11 more
> Caused by: java.lang.ArrayIndexOutOfBoundsException: 0
>       at org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReaderV2.readPatchedBaseValues(RunLengthIntegerReaderV2.java:171)
>       at org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReaderV2.readValues(RunLengthIntegerReaderV2.java:54)
>       at org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReaderV2.next(RunLengthIntegerReaderV2.java:287)
>       at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$LongTreeReader.next(RecordReaderImpl.java:473)
>       at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$StructTreeReader.next(RecordReaderImpl.java:1157)
>       at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.next(RecordReaderImpl.java:2196)
>       at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:129)
>       at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:80)
>       at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:274)
>       ... 15 more

--
This message was sent by Atlassian JIRA (v6.1#6144)