[ 
https://issues.apache.org/jira/browse/HIVE-10161?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Sergey Shelukhin resolved HIVE-10161.
-------------------------------------
    Resolution: Fixed

When multiple RGs include the same partial CB (due to ORC end boundary being an 
estimate), the first one reads the length, determines that this is an 
incomplete RG and bails, the 2nd one tries to blindly read the length but the 
compressed block is now offset by 3 bytes from the original read. Boom! Fixed 
that, also fixed some small issue with early unlocking.

> LLAP: ORC file contains compression buffers larger than bufferSize (OR reader 
> has a bug)
> ----------------------------------------------------------------------------------------
>
>                 Key: HIVE-10161
>                 URL: https://issues.apache.org/jira/browse/HIVE-10161
>             Project: Hive
>          Issue Type: Sub-task
>    Affects Versions: llap
>            Reporter: Gopal V
>            Assignee: Sergey Shelukhin
>             Fix For: llap
>
>
> The EncodedReaderImpl will die when reading from the cache, when reading data 
> written by the regular ORC writer 
> {code}
> Caused by: java.io.IOException: java.lang.IllegalArgumentException: Buffer 
> size too small. size = 262144 needed = 3919246
>         at 
> org.apache.hadoop.hive.llap.io.api.impl.LlapInputFormat$LlapRecordReader.rethrowErrorIfAny(LlapInputFormat.java:249)
>         at 
> org.apache.hadoop.hive.llap.io.api.impl.LlapInputFormat$LlapRecordReader.nextCvb(LlapInputFormat.java:201)
>         at 
> org.apache.hadoop.hive.llap.io.api.impl.LlapInputFormat$LlapRecordReader.next(LlapInputFormat.java:140)
>         at 
> org.apache.hadoop.hive.llap.io.api.impl.LlapInputFormat$LlapRecordReader.next(LlapInputFormat.java:96)
>         at 
> org.apache.hadoop.hive.ql.io.HiveContextAwareRecordReader.doNext(HiveContextAwareRecordReader.java:350)
>         ... 22 more
> Caused by: java.lang.IllegalArgumentException: Buffer size too small. size = 
> 262144 needed = 3919246
>         at 
> org.apache.hadoop.hive.ql.io.orc.InStream.addOneCompressionBuffer(InStream.java:780)
>         at 
> org.apache.hadoop.hive.ql.io.orc.InStream.uncompressStream(InStream.java:628)
>         at 
> org.apache.hadoop.hive.ql.io.orc.EncodedReaderImpl.readEncodedColumns(EncodedReaderImpl.java:309)
>         at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:278)
>         at 
> org.apache.hadoop.hive.llap.io.encoded.OrcEncodedDataReader.callInternal(OrcEncodedDataReader.java:48)
>         at 
> org.apache.hadoop.hive.common.CallableWithNdc.call(CallableWithNdc.java:37)
>         ... 4 more
> ]], Vertex failed as one or more tasks failed. failedTasks:1, Vertex 
> vertex_1424502260528_1945_1_00 [Map 1] killed/failed due to:null]
> {code}
> Turning off hive.llap.io.enabled makes the error go away.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to