[ 
https://issues.apache.org/jira/browse/HDFS-9079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14973816#comment-14973816
 ] 

Walter Su commented on HDFS-9079:
---------------------------------

bq. It seems that some failure cases were not considered. For example, what 
happen if the client dies after some of the streamers updated the GS but some 
are not?
If so, last step "6) Updates block on NN", aka "updatePipeline(newGS)" is not 
called. It's the same as the old way for non-ec blocks. Since both GS of 
updated/non-updated blocks are >= than GS of storedBlock, they can be used for 
block recovery.

> Erasure coding: preallocate multiple generation stamps and serialize updates 
> from data streamers
> ------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-9079
>                 URL: https://issues.apache.org/jira/browse/HDFS-9079
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: erasure-coding
>    Affects Versions: HDFS-7285
>            Reporter: Zhe Zhang
>            Assignee: Zhe Zhang
>         Attachments: HDFS-9079-HDFS-7285.00.patch, HDFS-9079.01.patch, 
> HDFS-9079.02.patch, HDFS-9079.03.patch, HDFS-9079.04.patch, HDFS-9079.05.patch
>
>
> A non-striped DataStreamer goes through the following steps in error handling:
> {code}
> 1) Finds error => 2) Asks NN for new GS => 3) Gets new GS from NN => 4) 
> Applies new GS to DN (createBlockOutputStream) => 5) Ack from DN => 6) 
> Updates block on NN
> {code}
> To simplify the above we can preallocate GS when NN creates a new striped 
> block group ({{FSN#createNewBlock}}). For each new striped block group we can 
> reserve {{NUM_PARITY_BLOCKS}} GS's. Then steps 1~3 in the above sequence can 
> be saved. If more than {{NUM_PARITY_BLOCKS}} errors have happened we 
> shouldn't try to further recover anyway.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to