[
https://issues.apache.org/jira/browse/HDFS-5881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13890909#comment-13890909
]
Kihwal Lee commented on HDFS-5881:
----------------------------------
As the comment added by HDFS-2834 along with the fix explains,
{{this.offsetFromChunkBoundary}} shouldn't be set before reading to skip data,
or skip() will end up skipping up to {{this.offsetFromChunkBoundary}} extra
bytes.
{code}
// We can't use this.offsetFromChunkBoundary because we need to know how
// many bytes of the offset were really read. Calling read(..) with a
// positive this.offsetFromChunkBoundary causes that many bytes to get
// silently skipped.
{code}
Instead, a big skip() should do one of the following:
- Set {{this.offsetFromChunkBoundary}} to 0.
- Call read() to read the new chunk offset bytes. This effectively skips chunk
offset bytes in the internal buffer.
OR
- Set {{this.offsetFromChunkBoundary}} to the new chunk offset.
- Don't call read() in skip().
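
To illustrate the difference, here is a minimal sketch of the two alternatives above next to the buggy combination. It is a toy model, not the actual short-circuit local reader code: the class {{SkipSketch}}, its 4-byte {{CHUNK}} size, and the methods {{seekToChunkStart()}} and {{firstByteAfter()}} are hypothetical names made up for this example, and the skip target is given as an absolute position for simplicity. The only behavior taken from the comment quoted above is that read() silently drops {{offsetFromChunkBoundary}} bytes when that field is positive.

```java
// Toy model only: not the real short-circuit local reader code.
class SkipSketch {
    static final int CHUNK = 4;            // hypothetical bytes per checksum chunk
    private final byte[] data;
    private int pos;                       // position in the underlying data
    private int offsetFromChunkBoundary;   // bytes the next read() drops first

    SkipSketch(byte[] data) { this.data = data; }

    // Mirrors the behavior in the quoted comment: a positive
    // offsetFromChunkBoundary causes that many bytes to be silently skipped.
    int read(byte[] buf, int len) {
        pos += offsetFromChunkBoundary;
        offsetFromChunkBoundary = 0;
        int n = Math.min(len, data.length - pos);
        if (n <= 0) return len == 0 ? 0 : -1;
        System.arraycopy(data, pos, buf, 0, n);
        pos += n;
        return n;
    }

    // A "big" skip first repositions to the chunk boundary at or before the
    // target and reports how far past that boundary the target lies.
    private int seekToChunkStart(int target) {
        pos = target - target % CHUNK;
        return target % CHUNK;
    }

    // First alternative: clear the field, then read() the offset bytes away.
    void skipWithRead(int target) {
        int newOffset = seekToChunkStart(target);
        offsetFromChunkBoundary = 0;
        read(new byte[newOffset], newOffset);
    }

    // Second alternative: set the field and let the next read() drop the bytes.
    void skipWithoutRead(int target) {
        offsetFromChunkBoundary = seekToChunkStart(target);
    }

    // Buggy combination: the field is set AND read() is called, so the
    // offset bytes are skipped twice.
    void skipBuggy(int target) {
        int newOffset = seekToChunkStart(target);
        offsetFromChunkBoundary = newOffset;
        read(new byte[newOffset], newOffset);
    }

    // Skips to target in a 16-byte array where data[i] == i, then returns
    // the next byte read, so the result shows where the reader ended up.
    static int firstByteAfter(String variant, int target) {
        byte[] data = new byte[16];
        for (int i = 0; i < data.length; i++) data[i] = (byte) i;
        SkipSketch r = new SkipSketch(data);
        if (variant.equals("read")) r.skipWithRead(target);
        else if (variant.equals("noread")) r.skipWithoutRead(target);
        else if (variant.equals("buggy")) r.skipBuggy(target);
        else throw new IllegalArgumentException(variant);
        byte[] one = new byte[1];
        return r.read(one, 1) < 0 ? -1 : one[0];
    }

    public static void main(String[] args) {
        System.out.println(firstByteAfter("read", 6));    // prints 6
        System.out.println(firstByteAfter("noread", 6));  // prints 6
        System.out.println(firstByteAfter("buggy", 6));   // prints 8
    }
}
```

In this model, skipping to position 6 gives a new chunk offset of 2. Both alternatives leave the reader at byte 6, while the buggy combination lands on byte 8, that is, 2 (the new chunk offset) bytes too far.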
> Back-port the skip() fix in the short-circuit local reader to 0.23.
> -------------------------------------------------------------------
>
> Key: HDFS-5881
> URL: https://issues.apache.org/jira/browse/HDFS-5881
> Project: Hadoop HDFS
> Issue Type: Bug
> Affects Versions: 0.23.10
> Reporter: Kihwal Lee
> Assignee: Kihwal Lee
> Priority: Critical
>
> It looks like a bug in skip() was introduced by HDFS-2356 and got fixed as a
> part of HDFS-2834, which is an API change JIRA. This bug causes skip() to skip
> more data (as many as the new offsetFromChunkBoundary bytes) in certain cases.
> This back-port is only for branch-0.23.
--
This message was sent by Atlassian JIRA
(v6.1.5#6160)