[ 
https://issues.apache.org/jira/browse/HDFS-5881?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13891652#comment-13891652
 ] 

Kihwal Lee commented on HDFS-5881:
----------------------------------

This bug is even lovelier than I originally thought.  skip() has another 
bug: it returns the wrong value. When that happens, DFSInputStream regards the 
skip as failed and creates a new BlockReaderLocal for subsequent reads. So the 
effect of the original skip bug was sometimes hidden, at the cost of unnecessary 
overhead.
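
To illustrate the masking effect, here is a rough sketch, not the actual 
DFSInputStream code. The Reader interface, canReuseAfterSkip() and the 
"misreporting" reader are hypothetical names made up for this example; the 
only assumption is that the caller compares the value returned by skip() 
against the distance it asked for and gives up on the reader when they differ.

```java
public class SkipFallbackSketch {
    /** Hypothetical stand-in for a block reader; only skip() matters here. */
    interface Reader {
        long skip(long n);
    }

    /**
     * Returns true if the caller would keep using the same reader after the
     * skip, i.e. the reported number of skipped bytes lands exactly on the
     * target position. A false result models "throw the reader away and
     * create a new one for subsequent reads".
     */
    static boolean canReuseAfterSkip(Reader reader, long pos, long targetPos) {
        long diff = targetPos - pos;
        long skipped = reader.skip(diff);
        return pos + skipped == targetPos;
    }

    public static void main(String[] args) {
        Reader honest = n -> n;           // reports exactly what was requested
        Reader misreporting = n -> n - 1; // hypothetical wrong return value

        System.out.println(canReuseAfterSkip(honest, 0, 100));       // true
        System.out.println(canReuseAfterSkip(misreporting, 0, 100)); // false
    }
}
```

In the second case the reader is discarded, so whatever state the skip left 
behind never gets a chance to affect a later read. That is the sense in which 
one bug can hide the other.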

This "bug-masking bug" does not kick in when no data remains in the internal 
32KB buffer. In that case the return value from skip() is correct and the same 
BlockReaderLocal instance is reused. So a chunk-aligned 32KB read, followed by 
a skip/seek and then a read, will hit the original bug and return wrong data.

The fix will make random reads faster and return correct data.

> Fix skip() of the short-circuit local reader in 0.23.
> -----------------------------------------------------
>
>                 Key: HDFS-5881
>                 URL: https://issues.apache.org/jira/browse/HDFS-5881
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 0.23.10
>            Reporter: Kihwal Lee
>            Assignee: Kihwal Lee
>            Priority: Critical
>
> It looks like a bug in skip() was introduced by HDFS-2356 and got fixed as a 
> part of HDFS-2834, which is an API change JIRA.  This bug causes skip() to 
> skip more data (by as much as the new offsetFromChunkBoundary) in certain cases.
> It is only for branch-0.23.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
