[ https://issues.apache.org/jira/browse/HIVE-6320?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Prasanth J updated HIVE-6320:
-----------------------------
    Description: 
The ORC data reader crashes with a BufferUnderflowException while reading data row by row with predicate push-down enabled, on current trunk.

Stack trace:
{code}
Caused by: java.nio.BufferUnderflowException
	at java.nio.Buffer.nextGetIndex(Buffer.java:472)
	at java.nio.HeapByteBuffer.get(HeapByteBuffer.java:117)
	at org.apache.hadoop.hive.ql.io.orc.InStream$CompressedStream.read(InStream.java:207)
	at org.apache.hadoop.hive.ql.io.orc.SerializationUtils.readInts(SerializationUtils.java:450)
	at org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReaderV2.readDirectValues(RunLengthIntegerReaderV2.java:240)
	at org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReaderV2.readValues(RunLengthIntegerReaderV2.java:53)
	at org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReaderV2.next(RunLengthIntegerReaderV2.java:288)
	at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$IntTreeReader.next(RecordReaderImpl.java:510)
	at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$StructTreeReader.next(RecordReaderImpl.java:1581)
	at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.next(RecordReaderImpl.java:2707)
	at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:125)
	at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:101)
{code}
or, alternatively:
{code}
Caused by: java.lang.IndexOutOfBoundsException
	at java.nio.ByteBuffer.wrap(ByteBuffer.java:352)
	at org.apache.hadoop.hive.ql.io.orc.InStream$CompressedStream.readHeader(InStream.java:180)
	at org.apache.hadoop.hive.ql.io.orc.InStream$CompressedStream.read(InStream.java:197)
	at org.apache.hadoop.hive.ql.io.orc.SerializationUtils.readInts(SerializationUtils.java:450)
	at org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReaderV2.readDirectValues(RunLengthIntegerReaderV2.java:252)
	at org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReaderV2.readValues(RunLengthIntegerReaderV2.java:59)
	at org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReaderV2.next(RunLengthIntegerReaderV2.java:300)
	at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$LongTreeReader.next(RecordReaderImpl.java:475)
	at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$StructTreeReader.next(RecordReaderImpl.java:1159)
	at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.next(RecordReaderImpl.java:2198)
	at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:108)
	at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:57)
	at org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:274)
	... 15 more
{code}
The query run is:
{code}
set hive.vectorized.execution.enabled=false;
set hive.optimize.index.filter=true;
insert overwrite directory '/tmp/foo' select * from lineitem where l_orderkey is not null;
{code}
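Both failure modes are standard java.nio precondition checks tripping inside InStream$CompressedStream: a relative get() throws BufferUnderflowException when fewer bytes remain between position and limit than the read needs, and ByteBuffer.wrap() throws IndexOutOfBoundsException when handed an offset/length pair outside the backing array. A minimal standalone sketch of the two paths (plain JDK calls, not the Hive reader code; the buffer sizes are arbitrary):
{code}
import java.nio.BufferUnderflowException;
import java.nio.ByteBuffer;

public class ByteBufferFailureModes {
    public static void main(String[] args) {
        // Path 1: a relative read past the limit throws
        // BufferUnderflowException (cf. HeapByteBuffer.get in the first trace).
        ByteBuffer buf = ByteBuffer.allocate(4);
        try {
            buf.getLong(); // needs 8 bytes, only 4 remain
        } catch (BufferUnderflowException e) {
            System.out.println("underflow: " + e);
        }

        // Path 2: wrap() with an offset/length outside the backing array
        // throws IndexOutOfBoundsException (cf. ByteBuffer.wrap in the
        // second trace).
        byte[] backing = new byte[16];
        try {
            ByteBuffer.wrap(backing, 8, 16); // offset 8 + length 16 > backing.length
        } catch (IndexOutOfBoundsException e) {
            System.out.println("out of bounds: " + e);
        }
    }
}
{code}
Either way it suggests the compressed stream's position/length bookkeeping has drifted past the data actually buffered, which is consistent with the failure only appearing when PPD skips row groups.

The reproduction assumes a TPC-H-style lineitem table stored as ORC; l_orderkey is the only column the predicate touches. A minimal hypothetical stand-in, with the column list trimmed:
{code}
-- Hypothetical trimmed-down stand-in for TPC-H lineitem (the real table
-- has 16 columns). Any ORC-backed table with a bigint l_orderkey,
-- populated with enough rows to span multiple ORC row groups
-- (10,000 rows each by default), exercises the same reader path once
-- PPD is enabled via hive.optimize.index.filter.
create table lineitem (
  l_orderkey bigint,
  l_partkey bigint,
  l_quantity double
) stored as orc;
{code}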
Reason: 

  was:
The ORC data reader crashes with a BufferUnderflowException while reading data row by row with predicate push-down enabled, on current trunk.
{code}
Caused by: java.nio.BufferUnderflowException
	at java.nio.Buffer.nextGetIndex(Buffer.java:472)
	at java.nio.HeapByteBuffer.get(HeapByteBuffer.java:117)
	at org.apache.hadoop.hive.ql.io.orc.InStream$CompressedStream.read(InStream.java:207)
	at org.apache.hadoop.hive.ql.io.orc.SerializationUtils.readInts(SerializationUtils.java:450)
	at org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReaderV2.readDirectValues(RunLengthIntegerReaderV2.java:240)
	at org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReaderV2.readValues(RunLengthIntegerReaderV2.java:53)
	at org.apache.hadoop.hive.ql.io.orc.RunLengthIntegerReaderV2.next(RunLengthIntegerReaderV2.java:288)
	at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$IntTreeReader.next(RecordReaderImpl.java:510)
	at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl$StructTreeReader.next(RecordReaderImpl.java:1581)
	at org.apache.hadoop.hive.ql.io.orc.RecordReaderImpl.next(RecordReaderImpl.java:2707)
	at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:125)
	at org.apache.hadoop.hive.ql.io.orc.OrcInputFormat$OrcRecordReader.next(OrcInputFormat.java:101)
{code}
The query run is:
{code}
set hive.vectorized.execution.enabled=false;
set hive.optimize.index.filter=true;
insert overwrite directory '/tmp/foo' select * from lineitem where l_orderkey is not null;
{code}

> Row-based ORC reader with PPD turned on dies on BufferUnderFlowException/IndexOutOfBoundsException
> ---------------------------------------------------------------------------------------------------
>
>                 Key: HIVE-6320
>                 URL: https://issues.apache.org/jira/browse/HIVE-6320
>             Project: Hive
>          Issue Type: Bug
>          Components: Serializers/Deserializers
>    Affects Versions: 0.13.0
>            Reporter: Gopal V
>            Assignee: Prasanth J
>              Labels: orcfile
>             Fix For: 0.13.0
>
>         Attachments: HIVE-6320.1.patch, HIVE-6320.2.patch, HIVE-6320.2.patch, HIVE-6320.3.patch
>
--
This message was sent by Atlassian JIRA
(v6.2#6252)