[
https://issues.apache.org/jira/browse/HDFS-9079?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15011892#comment-15011892
]
Uma Maheswara Rao G commented on HDFS-9079:
-------------------------------------------
# {quote}So 8 should be sufficient.{quote} Do you mean 3?
# {quote}or higher GS than the NN copy. {quote}
DNs report blocks to the standby node as well. Currently the standby node
decides whether to process a reported block based on its genstamp. If the
genstamp is higher than the one the standby knows, it does not process the
block and postpones it instead. If another file is created in the meantime and
a new genstamp is persisted in the NN, the standby may then accept that block,
but we can't rely on other file creations etc.
Could you take a look at BlockManager#processReportedBlock?
{code}
if (shouldPostponeBlocksFromFuture &&
    namesystem.isGenStampInFuture(block)) {
  queueReportedBlock(storageInfo, block, reportedState,
      QUEUE_REASON_FUTURE_GENSTAMP);
  return null;
}
{code}
As I remember, these queued messages are retried when processing addNewBlock
and updateBlocks, since by that time the genstamp should already have been
synced at the standby. But in this case I am wondering when the standby will
get a chance to process those messages.
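To make the concern concrete, here is a minimal toy model of the postponement
behaviour described above. It is only a sketch, not the BlockManager code: the
class StandbySketch, its ReportedBlock type and the onGenStampSynced hook are
hypothetical names, and the assumption is that queued reports are retried only
when the standby learns a newer genstamp.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.List;
import java.util.Queue;

/** Toy model (hypothetical, not HDFS code) of a standby postponing block reports. */
class StandbySketch {
    /** Hypothetical stand-in for a reported block: only an ID and a genstamp. */
    static final class ReportedBlock {
        final long blockId;
        final long genStamp;
        ReportedBlock(long blockId, long genStamp) {
            this.blockId = blockId;
            this.genStamp = genStamp;
        }
    }

    private long knownGenStamp;  // highest genstamp the standby has synced so far
    private final Queue<ReportedBlock> pending = new ArrayDeque<>();
    private final List<ReportedBlock> accepted = new ArrayList<>();

    StandbySketch(long initialGenStamp) {
        this.knownGenStamp = initialGenStamp;
    }

    /** Returns true if the report is processed now, false if it is queued. */
    boolean processReportedBlock(ReportedBlock b) {
        if (b.genStamp > knownGenStamp) {
            pending.add(b);  // genstamp is "in the future": postpone
            return false;
        }
        accepted.add(b);
        return true;
    }

    /** Assumed hook: called when the standby syncs a newer genstamp. */
    void onGenStampSynced(long newGenStamp) {
        knownGenStamp = Math.max(knownGenStamp, newGenStamp);
        int n = pending.size();  // retry each queued report exactly once
        for (int i = 0; i < n; i++) {
            processReportedBlock(pending.remove());
        }
    }

    int pendingCount() { return pending.size(); }
    int acceptedCount() { return accepted.size(); }
}
```

In this model a report stays queued until onGenStampSynced is called with a
high enough value. If nothing triggers that call, the report is never
processed, which is the situation I am asking about.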
> Erasure coding: preallocate multiple generation stamps and serialize updates
> from data streamers
> ------------------------------------------------------------------------------------------------
>
> Key: HDFS-9079
> URL: https://issues.apache.org/jira/browse/HDFS-9079
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: erasure-coding
> Affects Versions: HDFS-7285
> Reporter: Zhe Zhang
> Assignee: Zhe Zhang
> Attachments: HDFS-9079-HDFS-7285.00.patch, HDFS-9079.01.patch,
> HDFS-9079.02.patch, HDFS-9079.03.patch, HDFS-9079.04.patch,
> HDFS-9079.05.patch, HDFS-9079.06.patch, HDFS-9079.07.patch,
> HDFS-9079.08.patch, HDFS-9079.09.patch, HDFS-9079.10.patch, HDFS-9079.11.patch
>
>
> A non-striped DataStreamer goes through the following steps in error handling:
> {code}
> 1) Finds error => 2) Asks NN for new GS => 3) Gets new GS from NN => 4)
> Applies new GS to DN (createBlockOutputStream) => 5) Ack from DN => 6)
> Updates block on NN
> {code}
> With multiple streamer threads running in parallel, we need to correctly handle a
> large number of possible combinations of interleaved thread events. For
> example, {{streamer_B}} starts step 2 in between events {{streamer_A.2}} and
> {{streamer_A.3}}.
> HDFS-9040 moves steps 1, 2, 3, 6 from streamer to {{DFSStripedOutputStream}}.
> This JIRA proposes some further optimizations based on HDFS-9040:
> # We can preallocate GS when NN creates a new striped block group
> ({{FSN#createNewBlock}}). For each new striped block group we can reserve
> {{NUM_PARITY_BLOCKS}} GS's. If more than {{NUM_PARITY_BLOCKS}} errors have
> happened we shouldn't try to further recover anyway.
> # We can use a dedicated event processor to offload the error handling logic
> from {{DFSStripedOutputStream}}, which is not a long running daemon.
> # We can limit the lifespan of a streamer to be a single block. A streamer
> ends either after finishing the current block or when encountering a DN
> failure.
> With the proposed change, a {{StripedDataStreamer}}'s flow becomes:
> {code}
> 1) Finds DN error => 2) Notify coordinator (async, not waiting for response)
> => terminates
> 1) Finds external error => 2) Applies new GS to DN (createBlockOutputStream)
> => 3) Ack from DN => 4) Notify coordinator (async, not waiting for response)
> {code}
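
For reference, the preallocation idea in the quoted description could be
sketched as below. This is only an illustration under stated assumptions: the
class PreallocatedGenStamps is a hypothetical name, and it assumes the reserved
GS's are the consecutive values right after the block group's initial GS,
which the description does not specify.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical sketch: hands out the GS's reserved for one striped block group. */
class PreallocatedGenStamps {
    private final long baseGenStamp;  // GS assigned when the block group is created
    private final int reserved;       // number of reserved GS's (NUM_PARITY_BLOCKS)
    private final AtomicInteger used = new AtomicInteger(0);

    PreallocatedGenStamps(long baseGenStamp, int numParityBlocks) {
        this.baseGenStamp = baseGenStamp;
        this.reserved = numParityBlocks;
    }

    /**
     * Returns the next reserved GS without a round trip to the NN.
     * Assumption: reserved values are baseGenStamp + 1 .. baseGenStamp + reserved.
     */
    long nextGenStamp() {
        int n = used.incrementAndGet();
        if (n > reserved) {
            // More failures than parity blocks: further recovery is not attempted.
            throw new IllegalStateException("reserved generation stamps exhausted");
        }
        return baseGenStamp + n;
    }
}
```

With a base GS of 1000 and 3 parity blocks, three calls return 1001, 1002 and
1003, and a fourth call throws.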
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)