[ 
https://issues.apache.org/jira/browse/HDFS-8033?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14505991#comment-14505991
 ] 

Zhe Zhang commented on HDFS-8033:
---------------------------------

Thanks for the helpful comments Yi and Jing.

bq. In DFSStripedInputStream we override readBuffer, but we only read in one 
striped block, so the returned result should be something like (cell_0, cell_3, 
....) and it only contains part of the expected data
{{DFSStripedInputStream#readBuffer}} does switch the {{blockReader}}. So after 
reading cell_0, we'll switch to the next {{blockReader}} and read cell_1. 
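
To make the cell-to-reader switch concrete, here is a minimal sketch of round-robin cell placement. It is not code from the patch: the class and method names are hypothetical, and the 3 data blocks simply match the (cell_0, cell_3, ...) example quoted above.

```java
// Hypothetical helper, not part of the patch: maps a logical offset inside a
// block group to the internal block (and thus the blockReader) that holds it.
final class StripedLayoutSketch {
    private StripedLayoutSketch() {}

    /** Index of the internal data block storing the cell that contains offsetInGroup. */
    static int blockIndex(long offsetInGroup, int cellSize, int dataBlocks) {
        long cellIndex = offsetInGroup / cellSize;   // cell_0, cell_1, ...
        return (int) (cellIndex % dataBlocks);       // cells are placed round-robin
    }

    /** Offset of the same byte inside that internal block. */
    static long offsetInBlock(long offsetInGroup, int cellSize, int dataBlocks) {
        long cellIndex = offsetInGroup / cellSize;
        long stripeIndex = cellIndex / dataBlocks;   // full stripes preceding this cell
        return stripeIndex * cellSize + offsetInGroup % cellSize;
    }
}
```

With 3 data blocks, block 0 holds cell_0 and cell_3, so a sequential read has to move to block 1 as soon as cell_0 is consumed.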

It's very helpful that you brought up the _short read_ issue. In the current 
{{DFSInputStream}}, a stateful read calls {{blockReader.read()}} once, which 
returns all remaining data in the {{blockReader}}'s buffer; the size is most 
likely 64K bytes ({{BlockSender#MIN_BUFFER_WITH_TRANSFERTO}}). I had an offline 
discussion with [~cmccabe] about this behavior. It seems the rationale is to 
return as fast as possible with all cached data. Given our default cell size 
(128K or 256K), if we inherit the behavior from {{DFSInputStream}} and return 
64K at a time, in most cases we won't cross a cell boundary in a single 
{{read()}} anyway. So I didn't add the logic of reading across cell boundary in 
the patch. It's not too hard to add though, once we make a decision. But I 
think we should keep the behavior of trying to return with buffered data 
(instead of trying to read up to the request length). 
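
As a rough sketch of the short-read behavior described above (not the patch's actual logic), a single {{read()}} could return the smallest of the requested length, the buffered bytes, and the bytes left in the current cell. The class and method names are hypothetical, and the 64K / 128K figures in the note below are just the sizes mentioned above.

```java
// Hypothetical illustration: how many bytes one stateful read() hands back
// when it returns buffered data and never crosses a cell boundary.
final class ShortReadSketch {
    private ShortReadSketch() {}

    static int bytesToReturn(long pos, int requested, int buffered, int cellSize) {
        long remainingInCell = cellSize - (pos % cellSize);  // bytes before the next cell starts
        return (int) Math.min(Math.min(requested, buffered), remainingInCell);
    }
}
```

With a 128K cell and 64K buffered, reads that start at a 64K-aligned position return a full 64K; only a read that starts mid-buffer near the end of a cell gets clamped by the boundary.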

bq. In blockSeekTo, we need to handle refetchToken and refetchEncryptionKey. 
And for other IOException, we can throw it.
Good point. Since each EC internal block has only one destination DN, we won't 
have the _while_ loop to count retries. We can retry on different internal 
blocks.

bq. For the test, do stateful read: read once and fully read (please make the 
data size large than groupSize * cellSize), as I said in #1,
Will test reading multiple {{BLOCK_GROUP_SIZE}} to verify {{blockSeekTo}} 
switches between block groups correctly.
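
A test along these lines has to tolerate short reads, so the reading side needs a loop. The sketch below uses only {{java.io}}; {{shortReads}} is a hypothetical stand-in for a stream that returns at most a fixed chunk per call, not the real {{DFSStripedInputStream}}, and a real test would use sizes spanning several block groups.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical test helper: drains a stream that may return short reads.
final class StatefulReadSketch {
    private StatefulReadSketch() {}

    /** Calls read() repeatedly until buf is full or EOF; returns the bytes read. */
    static int readFully(InputStream in, byte[] buf) throws IOException {
        int total = 0;
        while (total < buf.length) {
            int n = in.read(buf, total, buf.length - total);
            if (n < 0) {
                break;               // EOF before the buffer was filled
            }
            total += n;
        }
        return total;
    }

    /** Stand-in stream that returns at most maxChunk (> 0) bytes per read call. */
    static InputStream shortReads(byte[] data, final int maxChunk) {
        return new ByteArrayInputStream(data) {
            @Override
            public synchronized int read(byte[] b, int off, int len) {
                return super.read(b, off, Math.min(len, maxChunk));
            }
        };
    }
}
```

Comparing the filled buffer against the expected bytes then checks both the "read once" and the "fully read" cases.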

bq. connectFailedOnce in blockSeekTo is not necessary.
I agree, will remove it.

bq. Why you modify SimulatedFSDataset?
Once HDFS-8191 is in, that won't be needed.

> Erasure coding: stateful (non-positional) read from files in striped layout
> ---------------------------------------------------------------------------
>
>                 Key: HDFS-8033
>                 URL: https://issues.apache.org/jira/browse/HDFS-8033
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Zhe Zhang
>            Assignee: Zhe Zhang
>         Attachments: HDFS-8033.000.patch, HDFS-8033.001.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
