[ 
https://issues.apache.org/jira/browse/HDFS-7782?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14393263#comment-14393263
 ] 

Jing Zhao commented on HDFS-7782:
---------------------------------

Thanks for updating the patch, Zhe! Comments on the 005 patch:
# {{DFS_DFSCLIENT_STRIPED_READ_THREADPOOL_SIZE}} can be renamed to 
{{DFS_CLIENT_STRIPED_READ_THREADPOOL_SIZE}}
# The current hard-coded schema is actually 3+2, not 6+3. We need to make it 
consistent with the thread pool size selection.
{code}
+  // With default 6+3 schema, each normal read could span 6 DNs. So this
+  // default value accommodates 3 read streams
{code}
# In {{DFSClient#initThreadsNumForStripedReads}}, the DFSClient object's 
monitor cannot protect the static field {{STRIPED_READ_THREAD_POOL}}.
# In {{initThreadsNumForStripedReads}}, the core pool size can be set to the 
default data block number (instead of 1)?
# We do not need to declare {{planReadPortions}} and {{parseStripedBlockGroup}} 
as "protected"
# Instead of overriding {{hedgedFetchBlockByteRange}} and throwing 
UnsupportedActionException, maybe we can add a check in 
{{DFSInputStream#pread}} to make sure no hedged read is attempted for a 
LocatedStripedBlock. Because the hedged read switch is at the DFSClient level 
rather than the DFSInputStream level, we should allow users to turn on hedged 
read for contiguous blocks while still reading striped blocks correctly.
# For {{fetchBlockByteRange}}, I still think we should avoid data copying (as 
Walter also commented). A simple solution might be: if the given buffer is 
large, split the read into iterations, and in each iteration read only one 
stripe of data in parallel. Another option is to still plan all the reads for 
each DN ahead of time, but extend the current {{getFromOneDataNode}} method so 
that a read from one DN can fill multiple different segments of the given 
buffer.
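
To illustrate the locking comment above (the static {{STRIPED_READ_THREAD_POOL}} field), here is a minimal sketch of the difference between locking on the instance and locking on the class. The class and method names below are hypothetical and are not taken from the patch; the point is only that a synchronized instance method locks {{this}}, so two different client objects can both be inside it at the same time.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Hypothetical stand-in for a client class that owns a static thread pool.
final class StripedReadPoolHolder {
    // Shared by every instance, so it needs a lock that is also shared.
    private static ExecutorService stripedReadPool;

    // Not sufficient: this locks the monitor of one instance ("this").
    // Two different instances hold two different monitors, so both can
    // see null here and each create a pool.
    synchronized void initWithInstanceLock(int threads) {
        if (stripedReadPool == null) {
            stripedReadPool = Executors.newFixedThreadPool(threads);
        }
    }

    // One possible fix: lock on the class object, which all instances share.
    static ExecutorService initWithClassLock(int threads) {
        synchronized (StripedReadPoolHolder.class) {
            if (stripedReadPool == null) {
                stripedReadPool = Executors.newFixedThreadPool(threads);
            }
            return stripedReadPool;
        }
    }
}
```

With the class-level lock, repeated calls return the same pool regardless of which caller gets there first, and the thread count passed by later callers is ignored.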

> Erasure coding: pread from files in striped layout
> --------------------------------------------------
>
>                 Key: HDFS-7782
>                 URL: https://issues.apache.org/jira/browse/HDFS-7782
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Li Bo
>            Assignee: Zhe Zhang
>         Attachments: HDFS-7782-000.patch, HDFS-7782-001.patch, 
> HDFS-7782-002.patch, HDFS-7782-003.patch, HDFS-7782-004.patch, 
> HDFS-7782-005.patch
>
>
> A client that wants to read a file should not need to know or handle what 
> layout the file uses. This sub-task adds logic to DFSInputStream to support 
> reading files in striped layout.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
