[ 
https://issues.apache.org/jira/browse/HDFS-5881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13890909#comment-13890909
 ] 

Kihwal Lee commented on HDFS-5881:
----------------------------------

As the comment added by HDFS-2834 along with the fix explains, 
{{this.offsetFromChunkBoundary}} must not be set before calling read() to skip 
data; otherwise read() ends up skipping as many as 
{{this.offsetFromChunkBoundary}} bytes more than intended.

{code}
    // We can't use this.offsetFromChunkBoundary because we need to know how
    // many bytes of the offset were really read. Calling read(..) with a
    // positive this.offsetFromChunkBoundary causes that many bytes to get
    // silently skipped.
{code}

Instead, a big skip() should do one of the following:
- Set {{this.offsetFromChunkBoundary}} to 0.
- Call read() to read the new chunk offset bytes. This effectively skips chunk 
offset bytes in the internal buffer.

OR

- Set {{this.offsetFromChunkBoundary}} to the new chunk offset.
- Don't call read() in skip().
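
To illustrate the two options, here is a minimal toy model, not the actual short-circuit local reader code. The class {{SkipSketch}}, the 8-byte {{CHUNK}}, the {{mode}} flag and {{firstByteAfterSkip()}} are all made up for this sketch; only {{offsetFromChunkBoundary}} and the idea that read() silently drops that many bytes come from the comment quoted above.

```java
// Toy model only: not the actual short-circuit local reader code.
class SkipSketch {
    static final int CHUNK = 8;       // hypothetical checksum chunk size
    final byte[] data;                // stands in for the block file
    int filePos = 0;                  // next byte to fetch from data
    int offsetFromChunkBoundary = 0;  // bytes read() drops before returning data

    SkipSketch(byte[] data) { this.data = data; }

    // Drops offsetFromChunkBoundary bytes silently, then copies up to len bytes.
    int read(byte[] buf, int off, int len) {
        filePos = Math.min(data.length, filePos + offsetFromChunkBoundary);
        offsetFromChunkBoundary = 0;
        int n = Math.min(len, data.length - filePos);
        System.arraycopy(data, filePos, buf, off, n);
        filePos += n;
        return n;
    }

    // mode: "A" = first option, "B" = second option, "BUG" = set offset AND read.
    void skip(int n, String mode) {
        int target = filePos + n;
        int newOffset = target % CHUNK;
        filePos = target - newOffset;  // reposition to the chunk boundary
        if (mode.equals("A")) {
            offsetFromChunkBoundary = 0;
            read(new byte[newOffset], 0, newOffset);  // consume the chunk offset bytes
        } else if (mode.equals("B")) {
            offsetFromChunkBoundary = newOffset;      // the next read() drops them
        } else {
            offsetFromChunkBoundary = newOffset;      // dropped once by read() ...
            read(new byte[newOffset], 0, newOffset);  // ... and consumed again here
        }
    }

    // Skips n bytes over data 0..63 and returns the next byte read.
    static int firstByteAfterSkip(String mode, int n) {
        byte[] data = new byte[64];
        for (int i = 0; i < data.length; i++) data[i] = (byte) i;
        SkipSketch r = new SkipSketch(data);
        r.skip(n, mode);
        byte[] one = new byte[1];
        r.read(one, 0, 1);
        return one[0];
    }

    public static void main(String[] args) {
        System.out.println(firstByteAfterSkip("A", 13));    // 13
        System.out.println(firstByteAfterSkip("B", 13));    // 13
        System.out.println(firstByteAfterSkip("BUG", 13));  // 18
    }
}
```

In this model, skipping 13 bytes leaves a chunk offset of 5. Both options land on byte 13, while setting the offset and also calling read() lands on byte 18, i.e. 5 bytes too far.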

> Back-port the skip() fix in the short-circuit local reader to 0.23.
> -------------------------------------------------------------------
>
>                 Key: HDFS-5881
>                 URL: https://issues.apache.org/jira/browse/HDFS-5881
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 0.23.10
>            Reporter: Kihwal Lee
>            Assignee: Kihwal Lee
>            Priority: Critical
>
> It looks like a bug in skip() was introduced by HDFS-2356 and was fixed as 
> part of HDFS-2834, which is an API change JIRA.  The bug causes skip() to skip 
> more data than requested (as many as the new offsetFromChunkBoundary bytes 
> more) in certain cases.
> This issue applies only to branch-0.23.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
