[ 
https://issues.apache.org/jira/browse/HDFS-9079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14909979#comment-14909979
 ] 

Walter Su commented on HDFS-9079:
---------------------------------

I see what you're doing. A few comments:
1. {{setupPipelineForAppendOrRecovery()}} trims bad nodes. When 
{{nodes.length == 0}}, the failed streamer won't call 
{{updateBlockForPipeline()}}. That's one reason you need HDFS-9040.
{code}
   LocatedBlock updateBlockForPipeline() throws IOException {
-    if (LOG.isDebugEnabled()) {
-      LOG.debug("updateBlockForPipeline(), " + this);
+    if (!getErrorState().hasExternalErrorOnly()) {
+      // If the streamer itself has encountered an error, bump GS
+      coordinator.addEvent(new BlockMetadataCoordinator.DNFailureEvent(index));
+      Preconditions.checkState(coordinator.getProposedGenStamp()
+          >= block.getGenerationStamp());
     }
{code}
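A minimal, self-contained sketch of the ordering point 1 describes (the class and method names here are made up for illustration, not the actual DataStreamer code): after trimming bad nodes, a streamer left with no live nodes must not attempt the GS bump.

{code:java}
// Hypothetical sketch: trim failed nodes from the pipeline, and only
// proceed to updateBlockForPipeline() if at least one node survives.
public class PipelineTrim {
  // Drop the nodes marked as failed, keeping the rest in order.
  static String[] trimBadNodes(String[] nodes, boolean[] failed) {
    int live = 0;
    for (boolean f : failed) {
      if (!f) live++;
    }
    String[] out = new String[live];
    int j = 0;
    for (int i = 0; i < nodes.length; i++) {
      if (!failed[i]) out[j++] = nodes[i];
    }
    return out;
  }

  // The streamer should only bump the GS while it still has a node to
  // write to; with nodes.length == 0 it must not call
  // updateBlockForPipeline().
  static boolean shouldUpdateBlockForPipeline(String[] nodes) {
    return nodes.length > 0;
  }

  public static void main(String[] args) {
    String[] nodes = {"dn1", "dn2", "dn3"};
    String[] trimmed = trimBadNodes(nodes, new boolean[]{true, true, true});
    assert trimmed.length == 0;
    assert !shouldUpdateBlockForPipeline(trimmed);
    System.out.println("ok");
  }
}
{code}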

2. The old way calls {{updatePipeline}} to refresh storedBlock as soon as one 
streamer fails. The internal block of the failed streamer has an old GS, so it 
won't be accepted by blockReport.
The new way delays {{updatePipeline}}: a single failure doesn't trigger it; 
only endBlock() does. Assume one failed block has gs=1001, another failed 
block has gs=1002, and the others have gs=1003. The storedBlock has gs=1001, 
right? Assume the client gets killed before endBlock(). Now every block can be 
accepted by blockReport. That affects lease recovery's judgement, since lease 
recovery needs 6 healthy blocks before it can start.
But I think it can be fixed easily: if a block with gs=1003 is reported, the 
blocks with gs=1001 and 1002 are obviously bad. So no problem.
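The easy fix above amounts to a staleness check by generation stamp. A hedged sketch (these names are illustrative, not the actual BlockManager code): among reported replicas of one block group, only those matching the highest GS count as live.

{code:java}
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch: replicas whose GS is below the highest reported
// GS for the block group are treated as bad.
public class StaleReplicaCheck {
  // Highest generation stamp among all reported replicas.
  static long highestGenStamp(List<Long> reportedGs) {
    return reportedGs.stream().mapToLong(Long::longValue).max().orElse(-1L);
  }

  // A replica is stale if its GS is lower than the highest GS seen.
  static boolean isStale(long replicaGs, long highestGs) {
    return replicaGs < highestGs;
  }

  public static void main(String[] args) {
    // Example from the comment: failures left gs=1001 and gs=1002 behind,
    // while the healthy blocks carry gs=1003.
    List<Long> reported = Arrays.asList(1001L, 1002L, 1003L, 1003L, 1003L);
    long highest = highestGenStamp(reported);
    assert highest == 1003L;
    assert isStale(1001L, highest);   // the gs=1001 block is bad
    assert isStale(1002L, highest);   // so is gs=1002
    assert !isStale(1003L, highest);  // gs=1003 blocks are live
    System.out.println("ok");
  }
}
{code}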

3. Is updatePipeline() called when overlapping failures finally get handled, 
or only just before endBlock()? I'm confused.
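The preallocation scheme in the issue description below could be sketched as follows (a toy illustration under my own naming, not the patch's actual code): reserve {{NUM_PARITY_BLOCKS}} generation stamps at block-group creation, and hand one out per failure without an NN round trip.

{code:java}
// Hypothetical sketch of preallocating generation stamps for a striped
// block group, as proposed in the description.
public class GenStampReservation {
  static final int NUM_PARITY_BLOCKS = 3;  // e.g. an RS(6,3) policy

  private final long firstGs;  // GS assigned when the block group is created
  private int bumpsUsed = 0;

  GenStampReservation(long firstGs) {
    this.firstGs = firstGs;
  }

  // Returns the next preallocated GS without contacting the NameNode,
  // or -1 once the reservation is exhausted: more than NUM_PARITY_BLOCKS
  // failures means we shouldn't try to recover further anyway.
  long nextGenStamp() {
    if (bumpsUsed >= NUM_PARITY_BLOCKS) {
      return -1L;
    }
    bumpsUsed++;
    return firstGs + bumpsUsed;
  }

  public static void main(String[] args) {
    GenStampReservation r = new GenStampReservation(1000L);
    assert r.nextGenStamp() == 1001L;  // first failure
    assert r.nextGenStamp() == 1002L;  // second failure
    assert r.nextGenStamp() == 1003L;  // third failure
    assert r.nextGenStamp() == -1L;    // reservation exhausted, give up
    System.out.println("ok");
  }
}
{code}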

> Erasure coding: preallocate multiple generation stamps and serialize updates 
> from data streamers
> ------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-9079
>                 URL: https://issues.apache.org/jira/browse/HDFS-9079
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>    Affects Versions: HDFS-7285
>            Reporter: Zhe Zhang
>            Assignee: Zhe Zhang
>         Attachments: HDFS-9079-HDFS-7285.00.patch
>
>
> A non-striped DataStreamer goes through the following steps in error handling:
> {code}
> 1) Finds error => 2) Asks NN for new GS => 3) Gets new GS from NN => 4) 
> Applies new GS to DN (createBlockOutputStream) => 5) Ack from DN => 6) 
> Updates block on NN
> {code}
> To simplify the above we can preallocate GS when NN creates a new striped 
> block group ({{FSN#createNewBlock}}). For each new striped block group we can 
> reserve {{NUM_PARITY_BLOCKS}} GS's. Then steps 1~3 in the above sequence can 
> be saved. If more than {{NUM_PARITY_BLOCKS}} errors have happened we 
> shouldn't try to further recover anyway.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
