[ 
https://issues.apache.org/jira/browse/HDFS-7782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14375609#comment-14375609
 ] 

Li Bo commented on HDFS-7782:
-----------------------------

Thanks for the work on striped reading, Zhe! 

Logic:
1.      In patch-002 you use a single block reader in DFSStripedInputStream. I 
think that’s a smart implementation, and it makes the patch much smaller. My 
concern is efficiency. In DFSStripedInputStream, after reading a cell, 
{{blockReader}} is closed and a new block reader is opened for the next cell, 
so block readers are opened and closed far more often. For example, with 3 
blocks of 128M each and a cell size of 1M, DFSInputStream opens 3 block 
readers, while DFSStripedInputStream has to open and close 128*3 block 
readers. Such frequent opening and closing of block readers may reduce the 
efficiency of DFSStripedInputStream.
2.      When a stripe is incomplete, it is padded with zero bytes. When we 
read the last stripe, we have to remove these zero bytes. More logic may be 
needed to handle the padded zero bytes.
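
To make the two points above concrete, here is a rough sketch of the 
arithmetic. It is not taken from the patch: the class and method names are 
hypothetical, and it assumes a stripe is {{cellSize * dataBlocks}} bytes wide 
and that only the last stripe can be padded.

```java
// Hypothetical helper, not part of HDFS-7782; names and layout are assumptions.
class StripedReadSketch {

    /** Block readers opened when a new reader is created for every cell. */
    static long readersPerCell(int numBlocks, long blockSize, long cellSize) {
        long cellsPerBlock = (blockSize + cellSize - 1) / cellSize; // round up
        return numBlocks * cellsPerBlock;
    }

    /** Real (non-padding) bytes in the last stripe; 0 for an empty file. */
    static long bytesInLastStripe(long fileLength, long cellSize, int dataBlocks) {
        long stripeWidth = cellSize * dataBlocks; // assumed stripe width
        long remainder = fileLength % stripeWidth;
        // A non-empty file whose length is a multiple of the stripe width
        // ends with a full stripe.
        return (remainder == 0 && fileLength > 0) ? stripeWidth : remainder;
    }

    /** Zero bytes that a reader would have to drop from the last stripe. */
    static long paddedBytes(long fileLength, long cellSize, int dataBlocks) {
        long real = bytesInLastStripe(fileLength, cellSize, dataBlocks);
        return real == 0 ? 0 : cellSize * dataBlocks - real;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        // The example from point 1: 3 blocks of 128M, 1M cells.
        System.out.println(readersPerCell(3, 128 * mb, mb)); // prints 384
        // Tiny numbers for point 2: 4-byte cells, 3 data blocks, 26-byte file.
        System.out.println(paddedBytes(26, 4, 3));
    }
}
```

A read of the last stripe could then be limited to the value returned by 
{{bytesInLastStripe}}, so the padding never reaches the caller. Whether the 
patch should do this in DFSStripedInputStream or elsewhere is a separate 
question.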

Some minor points:
1.      Some import entries are combined.
2.      In the constructor of DFSStripedInputStream, the log message is 
{{“Reading from a stripe file”}}. At this point reading has not started, and 
the user may not call {{read(…)}} immediately. How about moving this message 
to {{read(…)}}, or changing it to something like {{“Create a striped 
DFSInputStream”}}?


> Read a striping layout file from client side
> --------------------------------------------
>
>                 Key: HDFS-7782
>                 URL: https://issues.apache.org/jira/browse/HDFS-7782
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Li Bo
>            Assignee: Zhe Zhang
>         Attachments: HDFS-7782-000.patch, HDFS-7782-001.patch, 
> HDFS-7782-002.patch
>
>
> If a client wants to read a file, it should not need to know or handle the 
> file's layout. This sub-task adds logic to DFSInputStream to support 
> reading striping layout files.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
