[ 
https://issues.apache.org/jira/browse/HDFS-7348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14520686#comment-14520686
 ] 

Yi Liu commented on HDFS-7348:
------------------------------

Thanks, Bo and Zhe, for the discussion.

{quote}
On the write path:
...
{quote}
In the current implementation the recovery node is one of the source nodes; of 
course we can change this so that it can also be one of the targets in the 
future. Actually there is no big difference (we read locally in the first case, 
and write locally in the second case); both are reasonable to me.
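
For illustration, the "read locally" shortcut could look roughly like the 
sketch below; every name in it is a hypothetical stand-in, not code from the 
patch.
{code:java}
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch of the local-read shortcut; not code from the patch.
class SourceBlockReader {

  /**
   * Open one source block for reading, preferring the local replica
   * when this recovery node is itself one of the source nodes.
   */
  InputStream openSource(String sourceHost, String localHost,
                         Path localBlockFile) throws IOException {
    if (sourceHost.equals(localHost)) {
      // The recovery node is one of the sources: read the block file
      // from local storage instead of over the network.
      return Files.newInputStream(localBlockFile);
    }
    // Otherwise fall back to a normal remote block read.
    return openRemote(sourceHost);
  }

  InputStream openRemote(String host) throws IOException {
    // Open a remote block reader to the source datanode ...
    throw new UnsupportedOperationException("remote read not sketched");
  }
}
{code}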

Assume the recovery node is one of the targets (the final destination, the 
"fast track" as you said). That is an optimization of the current patch: we 
write the decoded block directly into the target folder (just as when we 
receive a continuous block replication in the datanode), so we don't need 
{{DataNode#DataTransfer}}. If there is more than one target, each of the other 
targets needs a different decoded block, and we need to transfer that block to 
it.

The situation becomes:
  1) If a target is remote, we send the decoded block to it directly.
  2) If the target is local, we write it directly to local storage.
So it will never happen that we save the decoded block locally and then need 
to transfer it again.
 
Hi guys, sending a block directly is not a big deal; we don't need to force 
the use of {{DataNode#DataTransfer}}. Of course we can refactor the common 
part out. Furthermore, as I said before, we may need to check the packet ack 
in the future.
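
To make the two cases above concrete, a minimal sketch could look like the 
following; all names here are hypothetical illustrations, not the actual 
patch code.
{code:java}
import java.io.IOException;
import java.net.InetAddress;
import java.net.InetSocketAddress;

// Hypothetical sketch of the two delivery cases; not the patch code.
class DecodedBlockDelivery {

  /** Deliver one decoded block to one target datanode. */
  void deliver(byte[] decodedBlock, InetSocketAddress target)
      throws IOException {
    if (isLocal(target.getAddress())) {
      // Case 2): the target is this datanode, so write the decoded
      // block straight into the local block directory; no transfer.
      writeLocally(decodedBlock);
    } else {
      // Case 1): the target is remote, so stream the block to it
      // directly, with no intermediate save-then-transfer step.
      sendDirectly(decodedBlock, target);
    }
  }

  boolean isLocal(InetAddress addr) throws IOException {
    return addr.isLoopbackAddress() || addr.equals(InetAddress.getLocalHost());
  }

  void writeLocally(byte[] block) {
    // Write into the local block/target directory ...
  }

  void sendDirectly(byte[] block, InetSocketAddress target) {
    // Open a connection to the target datanode and stream the block ...
  }
}
{code}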

{quote}
On the read path:
....
{quote}
Currently the default cell size is 256KB, which is not a small value; we can 
also expose it as a configuration option. About sequential vs. parallel 
reading: if it's a hard decision, I can make that configurable too.
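
For example, the two knobs could be exposed roughly as below; the key names 
are made up for illustration and are not real HDFS configuration keys.
{code:java}
import org.apache.hadoop.conf.Configuration;

// Hypothetical configuration keys for the two knobs discussed above;
// the defaults match the values mentioned in this comment.
class StripedReadConfig {
  static final String BUFFER_SIZE_KEY = "dfs.datanode.stripedread.buffer.size";
  static final int BUFFER_SIZE_DEFAULT = 256 * 1024; // 256KB cell size

  static final String SEQUENTIAL_KEY = "dfs.datanode.stripedread.sequential";
  static final boolean SEQUENTIAL_DEFAULT = true;

  final int bufferSize;     // read buffer per source block
  final boolean sequential; // sequential vs. parallel source reads

  StripedReadConfig(Configuration conf) {
    bufferSize = conf.getInt(BUFFER_SIZE_KEY, BUFFER_SIZE_DEFAULT);
    sequential = conf.getBoolean(SEQUENTIAL_KEY, SEQUENTIAL_DEFAULT);
  }
}
{code}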


So guys, let's reach a *conclusion*. How about we do the following:
*1.* Mainly keep the current approach in the patch. Also, as I said in the 
patch design, we need one optimization: if a source is local, we should read 
it directly. [~zhz], I think you can do this further improvement in your patch 
about the block reader, maybe in phase 2, since it will not block the 
functionality?

*2.* About writing locally: certainly, if the recovery node is one of the 
targets, we should write the block directly on that datanode. The write goes 
directly to the target location, just as when we receive a continuous block 
replication in the datanode, so we don't need to transfer it again. 
[~libo-intel], you can do this further improvement in your patch about the 
block writer, maybe also in phase 2, since currently it will not block the 
functionality?

*3.* I will make the buffer size configurable, and also the sequential vs. 
parallel reading, in a follow-on JIRA?

*4.* I will file a follow-on JIRA to check the packet ack, and we can do it 
later?

The remaining thing is to wait for the decode support from HADOOP-11847; then 
I will update that part of the patch and also the test. Of course, please 
review the existing code too in the meantime. Thanks, sounds good?

> Erasure Coding: striped block recovery
> --------------------------------------
>
>                 Key: HDFS-7348
>                 URL: https://issues.apache.org/jira/browse/HDFS-7348
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: datanode
>            Reporter: Kai Zheng
>            Assignee: Yi Liu
>         Attachments: ECWorker.java, HDFS-7348.001.patch
>
>
> This JIRA is to recover one or more missing striped blocks in a striped 
> block group.


