[jira] [Commented] (HDFS-8383) Tolerate multiple failures in DFSStripedOutputStream

Walter Su (JIRA) Sun, 06 Sep 2015 03:10:34 -0700

    [ 
https://issues.apache.org/jira/browse/HDFS-8383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14732332#comment-14732332
 ]


Walter Su commented on HDFS-8383:
---------------------------------

bq. When only one streamer fails, do we need to do anything? I think we can 
just ignore the failed streamer unless more than 3 streamers are found failed. 
The offline decode work will be started by some datanode later.
Maintaining the correctness of UC.replicas is required by lease recovery.

bq. I think it’s not right to set the failed status of streamer in outputstream 
due to the asynchronization.
So I have made it a follow-on.

bq. Not very clear about the error handling. For example, streamer_i fails to 
write a packet of block_j, but it succeeds to write block_j+1, could you give 
some detailed description about this situation?
Are you talking about different block groups? We haven't solved restarting a 
streamer after a single failure yet. This JIRA does not cover two failures in 
two different block groups. That should not be a problem once the single-failure 
case is solved, except for follow-on #4 of my last comment.
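
For illustration only, the tolerance rule from the first quoted question (keep 
writing unless more than 3 streamers have failed) could be sketched as below. 
This is not code from the attached patches. The class name 
StreamerFailureTracker and its methods are hypothetical, and the layout of 9 
streamers with 3 tolerated failures is an assumption. Each failure is recorded 
even when it is tolerated, since a correct record of failed streamers is the 
kind of state lease recovery would depend on.

```java
import java.util.BitSet;

/**
 * Hypothetical helper, NOT part of the HDFS-8383 patches.
 * Records which streamers of one block group have failed and reports
 * whether the number of failures is still within the tolerated limit.
 */
public class StreamerFailureTracker {
    private final int numStreamers;   // assumed: 9 streamers per block group
    private final int maxTolerated;   // assumed: 3, per the quoted question
    private final BitSet failed;

    public StreamerFailureTracker(int numStreamers, int maxTolerated) {
        if (numStreamers <= 0 || maxTolerated < 0 || maxTolerated > numStreamers) {
            throw new IllegalArgumentException("invalid streamer layout");
        }
        this.numStreamers = numStreamers;
        this.maxTolerated = maxTolerated;
        this.failed = new BitSet(numStreamers);
    }

    /** Record a failure. Marking the same streamer twice counts once. */
    public synchronized void markFailed(int index) {
        if (index < 0 || index >= numStreamers) {
            throw new IndexOutOfBoundsException("streamer index " + index);
        }
        failed.set(index);
    }

    /** Number of distinct streamers recorded as failed. */
    public synchronized int failedCount() {
        return failed.cardinality();
    }

    /** True while the number of failed streamers does not exceed the limit. */
    public synchronized boolean canContinue() {
        return failed.cardinality() <= maxTolerated;
    }

    /** True if the given streamer has not been recorded as failed. */
    public synchronized boolean isHealthy(int index) {
        return !failed.get(index);
    }
}
```

With new StreamerFailureTracker(9, 3), canContinue() stays true after three 
distinct streamers are marked failed and becomes false after a fourth. The 
methods are synchronized because streamers would report failures from their own 
threads. How the real output stream coordinates this is outside this sketch.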

> Tolerate multiple failures in DFSStripedOutputStream
> ----------------------------------------------------
>
>                 Key: HDFS-8383
>                 URL: https://issues.apache.org/jira/browse/HDFS-8383
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>            Reporter: Tsz Wo Nicholas Sze
>            Assignee: Walter Su
>         Attachments: HDFS-8383.00.patch, HDFS-8383.01.patch
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
