[ 
https://issues.apache.org/jira/browse/HDFS-8033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14504428#comment-14504428
 ] 

Yi Liu edited comment on HDFS-8033 at 4/21/15 6:33 AM:
-------------------------------------------------------

Thanks [~zhz] for working on this.  The patch looks good; my comments:
*1.*  In {{DFSInputStream}}, a stateful read does not try to fill the output 
*buf*: {{readWithStrategy}} calls {{readBuffer}} and returns as soon as it 
succeeds.  
In {{DFSStripedInputStream}} we override {{readBuffer}}, but it reads from only 
one striped block, so the returned result is something like (cell_0, 
cell_3, ....) and contains only part of the expected data. 
This is not correct. The test does exercise stateful read, but it reads fully 
and the data size is *BLOCK_GROUP_SIZE*, so the result is correct only by 
coincidence. 
I suggest that {{readBuffer}} in {{DFSStripedInputStream}} try to read fully 
unless it reaches the end of the file; of course, the final read length can be 
less than the input buf length if we hit EOF.

*2.* In {{blockSeekTo}}, we need to handle refetchToken and 
refetchEncryptionKey. Any other IOException can simply be thrown.

*3.*  For the test, cover stateful read in two ways: a single read and a full 
read (please make the data size larger than groupSize * cellSize), as I said 
in #1.

*4.*  {{connectFailedOnce}} in {{blockSeekTo}} is not necessary.

*5.*  Why did you modify {{SimulatedFSDataset}}?
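
To illustrate #1 and #3, here is a minimal sketch of the difference between a single read and a read-fully loop. It uses plain {{java.io}} only and is not the patch's code: {{ChunkedStream}} and {{readFully}} are hypothetical names, and the per-call limit merely stands in for a stream that returns one cell's worth of data per call.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

class ReadFullySketch {

  /** Hypothetical stand-in: a stream that returns at most maxPerCall bytes per read. */
  static final class ChunkedStream extends InputStream {
    private final InputStream in;
    private final int maxPerCall;

    ChunkedStream(byte[] data, int maxPerCall) {
      this.in = new ByteArrayInputStream(data);
      this.maxPerCall = maxPerCall;
    }

    @Override
    public int read() throws IOException {
      return in.read();
    }

    @Override
    public int read(byte[] buf, int off, int len) throws IOException {
      // A successful call may return fewer bytes than requested.
      return in.read(buf, off, Math.min(len, maxPerCall));
    }
  }

  /**
   * Keeps reading until len bytes are filled or EOF is reached.
   * Returns the number of bytes read, or -1 if EOF was hit before any byte.
   */
  static int readFully(InputStream in, byte[] buf, int off, int len)
      throws IOException {
    int total = 0;
    while (total < len) {
      int n = in.read(buf, off + total, len - total);
      if (n < 0) {
        break; // EOF: the final length may be shorter than len
      }
      total += n;
    }
    return (total == 0 && len > 0) ? -1 : total;
  }

  public static void main(String[] args) throws IOException {
    byte[] data = new byte[10];
    byte[] buf = new byte[8];
    int single = new ChunkedStream(data, 4).read(buf, 0, buf.length);
    int full = readFully(new ChunkedStream(data, 4), buf, 0, buf.length);
    System.out.println("single read: " + single + ", read fully: " + full);
  }
}
```

A test that only ever reads fully cannot tell these two behaviours apart, which is why a case with a single read on data larger than groupSize * cellSize is useful.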


> Erasure coding: stateful (non-positional) read from files in striped layout
> ---------------------------------------------------------------------------
>
>                 Key: HDFS-8033
>                 URL: https://issues.apache.org/jira/browse/HDFS-8033
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Zhe Zhang
>            Assignee: Zhe Zhang
>         Attachments: HDFS-8033.000.patch, HDFS-8033.001.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
