[ 
https://issues.apache.org/jira/browse/HDFS-6867?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14109663#comment-14109663
 ] 

Colin Patrick McCabe commented on HDFS-6867:
--------------------------------------------

[~zhezhang] and I chatted about this offline.  The main takeaway is that we 
might want to asynchronously search for new datanodes when a pipeline is not at 
full strength.  But we don't necessarily need to implement full asynchronous 
recovery in this JIRA (although maybe we could do that later).

Right now, clients that suffer temporary networking problems during pipeline 
recovery face an unpleasant choice: live with a short pipeline and the 
resulting under-replication (possibly long after the networking problems have 
cleared), or handle an exception that leaves the entire DFSOutputStream in an 
error state.  It would be great if we could instead let these clients continue 
writing, and bring the pipeline back up to full strength once the networking 
problems go away.
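
To make the idea concrete, here is a rough sketch of the "keep writing, top up 
in the background" behavior.  It is only an illustration, not code from 
DFSOutputStream or from any attached design doc: the class 
BackgroundPipelineTopUp, its methods, and the nodeFinder callback are all 
hypothetical names, and datanodes are represented as plain strings.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;
import java.util.function.Supplier;

/** Hypothetical sketch only; none of these names exist in HDFS. */
final class BackgroundPipelineTopUp implements AutoCloseable {
    private final int targetWidth;
    private final CopyOnWriteArrayList<String> pipeline;
    // Assumed callback: returns a replacement datanode, or empty if none
    // can be reached right now (for example, during a network problem).
    private final Supplier<Optional<String>> nodeFinder;
    private final ScheduledExecutorService scheduler =
        Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "pipeline-top-up");
            t.setDaemon(true);
            return t;
        });

    BackgroundPipelineTopUp(int targetWidth, List<String> initialNodes,
                            Supplier<Optional<String>> nodeFinder,
                            long retryMillis) {
        this.targetWidth = targetWidth;
        this.pipeline = new CopyOnWriteArrayList<>(initialNodes);
        this.nodeFinder = nodeFinder;
        // Periodically retry in the background; the writer is never blocked.
        scheduler.scheduleWithFixedDelay(this::tryTopUp, retryMillis,
                                         retryMillis, TimeUnit.MILLISECONDS);
    }

    /** Writer path: drop the failed node and carry on with a short pipeline. */
    void onNodeFailure(String node) {
        pipeline.remove(node);
    }

    /** One background attempt to bring the pipeline back to full strength. */
    void tryTopUp() {
        try {
            while (pipeline.size() < targetWidth) {
                Optional<String> candidate = nodeFinder.get();
                // Stop if nothing is reachable or the node is already in use;
                // the next scheduled run will try again.
                if (!candidate.isPresent()
                        || !pipeline.addIfAbsent(candidate.get())) {
                    return;
                }
                // A real client would also have to bring the new node's
                // replica up to date before using it; omitted here.
            }
        } catch (RuntimeException e) {
            // Swallow so that the scheduler keeps retrying later.
        }
    }

    boolean isUnderStrength() {
        return pipeline.size() < targetWidth;
    }

    List<String> currentPipeline() {
        return new ArrayList<>(pipeline);
    }

    @Override
    public void close() {
        scheduler.shutdownNow();
    }
}
```

The point of the sketch is that a datanode failure never throws to the writer: 
the pipeline simply runs short until a background attempt succeeds.  Whether 
the retry should be timer-driven, as assumed here, or triggered some other way 
is an open design question.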

> For DFSOutputStream, do pipeline recovery for a single block in the background
> ------------------------------------------------------------------------------
>
>                 Key: HDFS-6867
>                 URL: https://issues.apache.org/jira/browse/HDFS-6867
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: hdfs-client
>    Affects Versions: 2.0.0-alpha
>            Reporter: Colin Patrick McCabe
>            Assignee: Zhe Zhang
>         Attachments: HDFS-6867-design-20140820.pdf, 
> HDFS-6867-design-20140821.pdf, HDFS-6867-design-20140822.pdf
>
>
> For DFSOutputStream, we should be able to do pipeline recovery in the 
> background, while the user is continuing to write to the file.  This is 
> especially useful for long-lived clients that write to an HDFS file slowly. 



--
This message was sent by Atlassian JIRA
(v6.2#6252)
