[
https://issues.apache.org/jira/browse/HDFS-8383?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14732332#comment-14732332
]
Walter Su commented on HDFS-8383:
---------------------------------
bq. When only one streamer fails, do we need to do anything? I think we can
just ignore the failed streamer unless more than 3 streamers are found failed.
The offline decode work will be started by some datanode later.
Lease recovery requires UC.replicas to be kept correct, so even a single failed
streamer cannot simply be ignored.
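As a rough sketch of the threshold mentioned in the quote above (this is not the
HDFS implementation): assuming an RS(6,3) layout with 6 data and 3 parity
streamers, a writer could keep going as long as the number of failed streamers
does not exceed the parity count, while still recording which streamers failed.
The class and method names below are hypothetical.

```java
import java.util.BitSet;

/**
 * Hypothetical helper, not part of HDFS: records which streamers of one
 * block group have failed and whether the write can still proceed.
 */
final class StreamerFailureTracker {
    private final int numDataBlocks;
    private final int numParityBlocks;
    private final BitSet failed;

    StreamerFailureTracker(int numDataBlocks, int numParityBlocks) {
        if (numDataBlocks <= 0 || numParityBlocks < 0) {
            throw new IllegalArgumentException("invalid block counts");
        }
        this.numDataBlocks = numDataBlocks;
        this.numParityBlocks = numParityBlocks;
        this.failed = new BitSet(numDataBlocks + numParityBlocks);
    }

    /** Marks one streamer as failed; marking the same index twice is a no-op. */
    void markFailed(int streamerIndex) {
        if (streamerIndex < 0 || streamerIndex >= numDataBlocks + numParityBlocks) {
            throw new IndexOutOfBoundsException("streamer index " + streamerIndex);
        }
        failed.set(streamerIndex);
    }

    /** Number of distinct streamers marked as failed so far. */
    int failedCount() {
        return failed.cardinality();
    }

    /** True while the failures are still within what the parity blocks can cover. */
    boolean canContinue() {
        return failedCount() <= numParityBlocks;
    }
}
```

With 6 data and 3 parity blocks, canContinue() stays true for up to three
distinct failed streamers and turns false on the fourth. Keeping the set of
failed indices, rather than only a count, is what would let the caller report
accurate replica information later.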
bq. I think it’s not right to set the failed status of streamer in outputstream
due to the asynchronization.
So I have made it a follow-on.
bq. Not very clear about the error handling. For example, streamer_i fails to
write a packet of block_j, but it succeeds to write block_j+1, could you give
some detailed description about this situation?
Are you talking about different block groups? We haven't solved restarting a
streamer after a single failure yet. This jira doesn't cover two failures from
two different block groups. That should not be a problem once the single-failure
case is solved, except for follow-on #4 of my last comment.
> Tolerate multiple failures in DFSStripedOutputStream
> ----------------------------------------------------
>
> Key: HDFS-8383
> URL: https://issues.apache.org/jira/browse/HDFS-8383
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Reporter: Tsz Wo Nicholas Sze
> Assignee: Walter Su
> Attachments: HDFS-8383.00.patch, HDFS-8383.01.patch
>
>
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)