[ 
https://issues.apache.org/jira/browse/HIVE-11031?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14602613#comment-14602613
 ] 

Demeter Sztanko commented on HIVE-11031:
----------------------------------------

Hello [~prasanth_j], my MR jobs are getting this error when concatenating ORC 
files:

{code}
java.io.IOException: java.io.IOException: java.lang.IndexOutOfBoundsException: Index: 0
        at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
        at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
        at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:226)
        at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.next(HadoopShimsSecure.java:136)
        at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.moveToNext(MapTask.java:230)
        at org.apache.hadoop.mapred.MapTask$TrackedRecordReader.next(MapTask.java:210)
        at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:48)
        at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:366)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:415)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1190)
        at org.apache.hadoop.mapred.Child.main(Child.java:249)
Caused by: java.io.IOException: java.lang.IndexOutOfBoundsException: Index: 0
        at org.apache.hadoop.hive.io.HiveIOExceptionHandlerChain.handleRecordReaderNextException(HiveIOExceptionHandlerChain.java:121)
        at org.apache.hadoop.hive.io.HiveIOExceptionHandlerUtil.handleRecordReaderNextException(HiveIOExceptionHandlerUtil.java:77)
        at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:355)
        at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:105)
        at org.apache.hadoop.hive.ql.io.CombineHiveRecordReader.doNext(CombineHiveRecordReader.java:41)
        at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.next(HiveContextAwareRecordReader.java:116)
        at org.apache.hadoop.hive.shims.HadoopShimsSecure$CombineFileRecordReader.doNextWithExceptionHandler(HadoopShimsSecure.java:224)
        ... 11 more
Caused by: java.lang.IndexOutOfBoundsException: Index: 0
        at java.util.Collections$EmptyList.get(Collections.java:3212)
        at org.apache.hadoop.hive.ql.io.orc.OrcFileStripeMergeRecordReader.nextStripe(OrcFileStripeMergeRecordReader.java:82)
        at org.apache.hadoop.hive.ql.io.orc.OrcFileStripeMergeRecordReader.next(OrcFileStripeMergeRecordReader.java:71)
        at org.apache.hadoop.hive.ql.io.orc.OrcFileStripeMergeRecordReader.next(OrcFileStripeMergeRecordReader.java:31)
        at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:350)
        ... 15 more
2015-06-26 08:24:19,248 INFO org.apache.hadoop.mapred.Task: Runnning cleanup for the task
{code}

Is this failure caused by the bug described in this ticket, or could it be a 
different problem?
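
For reference, the innermost frame in the trace is Collections$EmptyList.get, which is what Java throws when get(0) is called on an empty immutable list. The snippet below is only a minimal sketch of that behaviour and of a defensive check. EmptyListSketch, firstOrNull and the stripes variable are hypothetical names, not Hive code, and the trace alone does not show which list is empty inside OrcFileStripeMergeRecordReader.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical illustration only; this is not Hive's reader code.
final class EmptyListSketch {

    // Returns the first element, or null when the list has no elements,
    // instead of letting get(0) throw IndexOutOfBoundsException.
    static <T> T firstOrNull(List<T> items) {
        return items.isEmpty() ? null : items.get(0);
    }

    public static void main(String[] args) {
        // Assumed stand-in for a per-file list that turned out to be empty.
        List<String> stripes = Collections.emptyList();

        try {
            stripes.get(0); // same failing call shape as the innermost frame
        } catch (IndexOutOfBoundsException e) {
            System.out.println("caught: " + e);
        }

        System.out.println("guarded: " + firstOrNull(stripes));
        System.out.println("guarded: " + firstOrNull(Arrays.asList("s0", "s1")));
    }
}
```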

> ORC concatenation of old files can fail while merging column statistics
> -----------------------------------------------------------------------
>
>                 Key: HIVE-11031
>                 URL: https://issues.apache.org/jira/browse/HIVE-11031
>             Project: Hive
>          Issue Type: Bug
>    Affects Versions: 0.13.0, 0.14.0, 1.0.0, 1.2.0, 1.1.0, 2.0.0
>            Reporter: Prasanth Jayachandran
>            Assignee: Prasanth Jayachandran
>            Priority: Critical
>             Fix For: 1.2.1
>
>         Attachments: HIVE-11031-branch-1.0.patch, HIVE-11031.2.patch, 
> HIVE-11031.3.patch, HIVE-11031.4.patch, HIVE-11031.patch
>
>
> Column statistics in ORC are optional protobuf fields. Old ORC files might 
> not have statistics for newly added types such as decimal, date, and 
> timestamp. However, column statistics merging assumes that statistics exist 
> for these types and invokes merge. For example, merging of 
> TimestampColumnStatistics directly casts the received ColumnStatistics object 
> without an instanceof check. If the ORC file contains timestamp column 
> statistics, this works; otherwise it throws a ClassCastException.
> Also, the file merge operator swallows the exception.
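
The cast problem described above can be illustrated with a small, self-contained sketch. The types here (ColumnStats, TimestampStats, GenericStats, StatsMergeSketch) are hypothetical stand-ins and not Hive's or ORC's actual classes, and the guarded variant is only one possible way to handle a missing statistics type, not necessarily what the attached patches do.

```java
// Hypothetical stand-ins for the statistics hierarchy; not Hive's real classes.
interface ColumnStats {}

final class TimestampStats implements ColumnStats {
    long min;
    long max;

    TimestampStats(long min, long max) {
        this.min = min;
        this.max = max;
    }
}

// Models an old file that carries only generic statistics for the column.
final class GenericStats implements ColumnStats {}

final class StatsMergeSketch {

    // Unchecked merge: the direct cast throws ClassCastException
    // when 'other' is not a TimestampStats.
    static void mergeUnchecked(TimestampStats target, ColumnStats other) {
        TimestampStats ts = (TimestampStats) other;
        target.min = Math.min(target.min, ts.min);
        target.max = Math.max(target.max, ts.max);
    }

    // Guarded merge: checks the runtime type first and reports
    // whether the merge was actually applied.
    static boolean mergeChecked(TimestampStats target, ColumnStats other) {
        if (!(other instanceof TimestampStats)) {
            return false; // incompatible or missing type-specific statistics
        }
        TimestampStats ts = (TimestampStats) other;
        target.min = Math.min(target.min, ts.min);
        target.max = Math.max(target.max, ts.max);
        return true;
    }

    public static void main(String[] args) {
        TimestampStats merged = new TimestampStats(10, 20);
        System.out.println(mergeChecked(merged, new TimestampStats(5, 15)));
        System.out.println(mergeChecked(merged, new GenericStats()));
        try {
            mergeUnchecked(merged, new GenericStats());
        } catch (ClassCastException e) {
            System.out.println("unchecked merge failed: " + e.getClass().getName());
        }
    }
}
```

Returning a boolean rather than silently skipping keeps the failure visible to the caller, which matters given that the file merge operator is reported to swallow the exception.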


