[
https://issues.apache.org/jira/browse/HIVE-16671?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16011551#comment-16011551
]
Prasanth Jayachandran commented on HIVE-16671:
----------------------------------------------
The 3-byte check makes me wonder whether this is somehow related to splits
starting at offset 0 vs. offset 3.
When the BI split strategy is chosen, could the entire file/block become a
single split? Say a file is 1000 bytes: the split offset will be 0 and the
length will be 1000. For the same file, if the ETL split strategy is chosen,
the split offset will be 3 and the length will be 997. The first 3 bytes are
skipped because they are the ORC magic header.
Do you have a repro for this issue? If so, could you check the split
boundaries to confirm whether this is the case?
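
To make the two boundary cases above concrete, here is a minimal sketch of the
arithmetic only. It is not Hive's split-generation code: the class and method
names (SplitBoundarySketch, biSplit, etlSplit) are hypothetical, and it assumes
the behavior described in this comment, i.e. BI covers the whole file and ETL
starts after the 3-byte "ORC" magic.

```java
// Hypothetical illustration; names are assumptions, not Hive APIs.
final class SplitBoundarySketch {
    // Length in bytes of the "ORC" magic at the start of an ORC file.
    static final long ORC_MAGIC_LENGTH = 3L;

    // BI strategy as described above: the whole file is one split.
    // Returns {offset, length}.
    static long[] biSplit(long fileLength) {
        return new long[] {0L, fileLength};
    }

    // ETL strategy as described above: the split starts after the magic.
    // Assumes fileLength >= ORC_MAGIC_LENGTH. Returns {offset, length}.
    static long[] etlSplit(long fileLength) {
        return new long[] {ORC_MAGIC_LENGTH, fileLength - ORC_MAGIC_LENGTH};
    }

    public static void main(String[] args) {
        long fileLength = 1000L;
        long[] bi = biSplit(fileLength);
        long[] etl = etlSplit(fileLength);
        System.out.println("BI  split: offset=" + bi[0] + " length=" + bi[1]);
        System.out.println("ETL split: offset=" + etl[0] + " length=" + etl[1]);
    }
}
```

For the 1000-byte file in the example, this gives offset 0 / length 1000 for BI
and offset 3 / length 997 for ETL. Both splits end at byte 1000, so any
difference in behavior would come from the start offset.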
> LLAP IO: BufferUnderflowException may happen in very rare(?) cases due to ORC
> end-of-CB estimation
> --------------------------------------------------------------------------------------------------
>
> Key: HIVE-16671
> URL: https://issues.apache.org/jira/browse/HIVE-16671
> Project: Hive
> Issue Type: Bug
> Reporter: Ravi Mutyala
> Assignee: Sergey Shelukhin
> Attachments: HIVE-16671.patch
>
>