[
https://issues.apache.org/jira/browse/HDFS-9079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14909979#comment-14909979
]
Walter Su commented on HDFS-9079:
---------------------------------
I see what you're doing. Some comments.
1. setupPipelineForAppendOrRecovery() will trim bad nodes. When
nodes.length==0, the failed streamer won't call updateBlockForPipeline().
That's one reason you need HDFS-9040.
{code}
 LocatedBlock updateBlockForPipeline() throws IOException {
-  if (LOG.isDebugEnabled()) {
-    LOG.debug("updateBlockForPipeline(), " + this);
+  if (!getErrorState().hasExternalErrorOnly()) {
+    // If the streamer itself has encountered an error, bump GS
+    coordinator.addEvent(new BlockMetadataCoordinator.DNFailureEvent(index));
+    Preconditions.checkState(coordinator.getProposedGenStamp()
+        >= block.getGenerationStamp());
   }
{code}
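To make point 1 concrete, here is a minimal sketch (hypothetical names, not the actual DataStreamer code) of how trimming failed datanodes can leave nodes.length == 0, so the streamer bails out before it would ever reach updateBlockForPipeline():

```java
import java.util.ArrayList;
import java.util.List;

/** Sketch (hypothetical, not HDFS code) of point 1: trimming bad
 *  datanodes can empty the pipeline, in which case recovery stops
 *  before the streamer would ask for a new generation stamp. */
public class TrimSketch {
  /** Drop the nodes flagged as failed; mirrors "trim bad nodes". */
  static List<String> trim(List<String> nodes, List<Integer> badIndices) {
    List<String> remaining = new ArrayList<>();
    for (int i = 0; i < nodes.size(); i++) {
      if (!badIndices.contains(i)) {
        remaining.add(nodes.get(i));
      }
    }
    return remaining;
  }

  /** True iff recovery would proceed to updateBlockForPipeline(). */
  static boolean wouldBumpGenStamp(List<String> nodes, List<Integer> bad) {
    return !trim(nodes, bad).isEmpty();
  }

  public static void main(String[] args) {
    // The only remaining node fails => nodes.length == 0 after trimming,
    // so updateBlockForPipeline() is never called.
    System.out.println(wouldBumpGenStamp(List.of("dn1"), List.of(0)));
    // prints false
  }
}
```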
2. The old way calls {{updatePipeline}} to refresh storedBlock as soon as one
streamer fails. The internal block of the failed streamer has an old GS, so it
won't be accepted by blockReport.
The new way delays {{updatePipeline}}: a single failure doesn't trigger it;
only endBlock() does. Assume one failed block has gs=1001, another failed block
has gs=1002, and the others have gs=1003. The storedBlock has gs=1001, right?
Assume the client gets killed before endBlock(). Now every block can be
accepted by blockReport. That affects lease recovery's judgement, since lease
recovery needs 6 healthy blocks before it can start.
But I think it can be fixed easily. If a block with gs=1003 is reported, the
blocks with gs=1001 and gs=1002 are obviously bad. So no problem.
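The "easy fix" above can be sketched like this (hypothetical helper names, not HDFS code): once any replica reports the highest generation stamp, replicas carrying a lower GS can be treated as stale.

```java
import java.util.Arrays;

/** Minimal sketch (not HDFS code) of the idea above: among reported
 *  internal blocks, anything below the highest generation stamp is
 *  obviously stale. Names here are hypothetical. */
public class StaleReplicaSketch {
  /** Highest generation stamp among reported replicas. */
  static long maxGenStamp(long[] reportedGs) {
    return Arrays.stream(reportedGs).max().getAsLong();
  }

  /** A replica is stale if its GS is below the highest reported GS. */
  static boolean isStale(long gs, long maxGs) {
    return gs < maxGs;
  }

  public static void main(String[] args) {
    // From the example: two failed streamers stopped at gs=1001/1002,
    // the healthy ones advanced to gs=1003.
    long[] reported = {1001L, 1002L, 1003L, 1003L, 1003L, 1003L};
    long max = maxGenStamp(reported);
    for (long gs : reported) {
      System.out.println("gs=" + gs + " stale=" + isStale(gs, max));
    }
  }
}
```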
3. Is updatePipeline() called when overlapping failures finally get handled, or
only just before endBlock()? I'm confused.
> Erasure coding: preallocate multiple generation stamps and serialize updates
> from data streamers
> ------------------------------------------------------------------------------------------------
>
> Key: HDFS-9079
> URL: https://issues.apache.org/jira/browse/HDFS-9079
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Affects Versions: HDFS-7285
> Reporter: Zhe Zhang
> Assignee: Zhe Zhang
> Attachments: HDFS-9079-HDFS-7285.00.patch
>
>
> A non-striped DataStreamer goes through the following steps in error handling:
> {code}
> 1) Finds error => 2) Asks NN for new GS => 3) Gets new GS from NN => 4)
> Applies new GS to DN (createBlockOutputStream) => 5) Ack from DN => 6)
> Updates block on NN
> {code}
> To simplify the above we can preallocate GS when NN creates a new striped
> block group ({{FSN#createNewBlock}}). For each new striped block group we can
> reserve {{NUM_PARITY_BLOCKS}} GS's. Then steps 1~3 in the above sequence can
> be saved. If more than {{NUM_PARITY_BLOCKS}} errors have happened we
> shouldn't try to further recover anyway.
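The preallocation idea in the description could look roughly like this (a sketch under assumed names; the actual patch may differ): reserve {{NUM_PARITY_BLOCKS}} extra generation stamps at block-group creation, hand one out per failure, and give up once they are exhausted.

```java
import java.util.concurrent.atomic.AtomicLong;

/** Sketch (hypothetical names) of the preallocation idea: the NN
 *  reserves NUM_PARITY_BLOCKS extra generation stamps when it creates
 *  a striped block group, so each of the first NUM_PARITY_BLOCKS
 *  failures takes the next reserved GS without an NN round trip. */
public class GenStampReservation {
  static final int NUM_PARITY_BLOCKS = 3; // e.g. RS-6-3

  private final long firstGs;     // GS assigned at block-group creation
  private final AtomicLong next;  // next reserved GS to hand out

  GenStampReservation(long firstGs) {
    this.firstGs = firstGs;
    this.next = new AtomicLong(firstGs + 1);
  }

  /** Hands out the next reserved GS; fails once the reservation is
   *  exhausted, matching "more than NUM_PARITY_BLOCKS errors =>
   *  shouldn't try to further recover anyway". */
  long nextReservedGenStamp() {
    long gs = next.getAndIncrement();
    if (gs > firstGs + NUM_PARITY_BLOCKS) {
      throw new IllegalStateException("too many failures; give up");
    }
    return gs;
  }

  public static void main(String[] args) {
    GenStampReservation r = new GenStampReservation(1000L);
    System.out.println(r.nextReservedGenStamp()); // 1001
    System.out.println(r.nextReservedGenStamp()); // 1002
  }
}
```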
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)