[
https://issues.apache.org/jira/browse/HDFS-6867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14132121#comment-14132121
]
Zhe Zhang commented on HDFS-6867:
---------------------------------
[~cmccabe] Thanks for the suggestion. I think Future is a good idea. To further
simplify the design I have made {{RecoveryWorker}} more _stateless_ -- instead
of running until full-strength replication is restored, it finishes as soon
as at least one new DN is found. Then {{DataStreamer}} tries to recover the
pipeline, and fires another {{RecoveryWorker}} if the pipeline is still not at
full strength.
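A rough sketch of the one-shot worker and the retry loop described above, to make
the control flow concrete. The class {{RecoverySketch}}, the method
{{recoverToFullStrength}}, the {{dnFinder}} supplier (a stand-in for whatever NN
lookup is eventually used) and the {{maxAttempts}} bound are all hypothetical
names for illustration, not code from the attached patches. DNs are plain
strings here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Supplier;

/** Hypothetical sketch; these are not the classes from the HDFS-6867 patches. */
class RecoverySketch {

  /** One-shot worker: makes a single attempt to find a replacement DN, then finishes. */
  static final class RecoveryWorker implements Callable<Optional<String>> {
    // Assumed stand-in for the NN lookup; returns empty when no DN is available.
    private final Supplier<Optional<String>> dnFinder;

    RecoveryWorker(Supplier<Optional<String>> dnFinder) {
      this.dnFinder = dnFinder;
    }

    @Override
    public Optional<String> call() {
      // No loop and no state kept between runs: the caller decides whether to retry.
      return dnFinder.get();
    }
  }

  /** Streamer-side loop: fire a worker, fold in its result, repeat until full strength. */
  static List<String> recoverToFullStrength(List<String> pipeline, int targetSize,
      Supplier<Optional<String>> dnFinder, int maxAttempts) throws Exception {
    List<String> nodes = new ArrayList<>(pipeline);
    ExecutorService pool = Executors.newSingleThreadExecutor();
    try {
      for (int i = 0; i < maxAttempts && nodes.size() < targetSize; i++) {
        Future<Optional<String>> found = pool.submit(new RecoveryWorker(dnFinder));
        // A real streamer would keep writing and poll found.isDone();
        // this sketch simply blocks on the result.
        found.get().ifPresent(nodes::add);
      }
    } finally {
      pool.shutdown();
    }
    return nodes;
  }
}
```

Because each worker returns after one attempt, the only state that survives
between attempts is the pipeline itself, which stays with the streamer.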
bq. I think to do this right, we'll need to modify an existing NameNode RPC
like updatePipeline, or perhaps add a new RPC, that just gives us a new
DataNode suitable for our pipeline, but does not change the state of the block
on the NN. Basically, this means breaking updatePipeline into two separate
calls... one to find a good node (that RecoveryWorker will call), and *another
to actually update the block to be in RBR (that DataStreamer will call)*.
In the second call (highlighted above) the NN needs to recalculate the good DN
to use, right? Since the mission of the {{RecoveryWorker}} is to give the
{{DataStreamer}} a hint about good times to trigger the real recovery
process, I think we do need a new RPC: a lightweight _peek_ call that
indicates (with reasonable accuracy) whether the recovery will succeed. E.g.,
the NN can at least quickly verify that the number of live DNs is larger than
{{nodes.length}} in the current pipeline.
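To illustrate the shape of such a _peek_ call, here is a minimal sketch. The
interface {{RecoveryPeek}}, the method {{canLikelyRecover}} and the toy
{{LiveCountPeek}} implementation are hypothetical names, not an existing
NameNode RPC. The sketch only encodes the live-DN count comparison mentioned
above and does not touch any block state.

```java
import java.util.Set;

/** Hypothetical sketch of a read-only "peek" check; the names are not HDFS APIs. */
class PeekSketch {

  /** Client-facing view of the proposed lightweight call. */
  interface RecoveryPeek {
    /** Returns true if recovery is likely to find a new DN. Must not change block state. */
    boolean canLikelyRecover(String[] nodes);
  }

  /** Toy NN-side check: are there more live DNs than the current pipeline holds? */
  static final class LiveCountPeek implements RecoveryPeek {
    private final Set<String> liveDataNodes;

    LiveCountPeek(Set<String> liveDataNodes) {
      this.liveDataNodes = liveDataNodes;
    }

    @Override
    public boolean canLikelyRecover(String[] nodes) {
      // Only a hint: a spare live DN may still turn out to be unsuitable.
      return liveDataNodes.size() > nodes.length;
    }
  }
}
```

A positive answer is only a hint for when to trigger the real recovery; the
actual recovery call still has to pick the DN itself.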
> For DFSOutputStream, do pipeline recovery for a single block in the background
> ------------------------------------------------------------------------------
>
> Key: HDFS-6867
> URL: https://issues.apache.org/jira/browse/HDFS-6867
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: hdfs-client
> Reporter: Colin Patrick McCabe
> Assignee: Zhe Zhang
> Attachments: HDFS-6867-20140827-2.patch, HDFS-6867-20140827-3.patch,
> HDFS-6867-20140827.patch, HDFS-6867-20140828-1.patch,
> HDFS-6867-20140828-2.patch, HDFS-6867-20140910.patch,
> HDFS-6867-20140911.patch, HDFS-6867-design-20140820.pdf,
> HDFS-6867-design-20140821.pdf, HDFS-6867-design-20140822.pdf,
> HDFS-6867-design-20140827.pdf, HDFS-6867-design-20140910.pdf
>
>
> For DFSOutputStream, we should be able to do pipeline recovery in the
> background, while the user is continuing to write to the file. This is
> especially useful for long-lived clients that write to an HDFS file slowly.