[
https://issues.apache.org/jira/browse/HDFS-1172?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12871287#action_12871287
]
Todd Lipcon commented on HDFS-1172:
-----------------------------------
I think there are a few solutions to this:
- HDFS-611 should help a lot. We have often seen this issue after doing a
large-scale decrease in replication count, or a large directory removal, since
the block deletions hold up the blockReceived call in DN.offerService. But this
isn't a full solution: there are still other ways in which the DN can be
slower at acking a new block than the client is in calling completeFile.
- Scott's solution of making the primary DN send the blockReceived on behalf
of all DNs would work, but sounds complicated, especially in the failure cases
(e.g. what if the primary DN fails just before sending the RPC? Do we lose all
the replicas? No good!)
- UnderReplicatedBlocks could be augmented to carry a dontProcessUntil
timestamp. When we check replication in response to a completeFile, we can mark
the neededReplications with a "don't process until N seconds from now" which
causes them to get skipped over by the replication monitor thread until a later
time. This should give the DNs a bit of leeway to report the blocks, while not
changing the control flow or distributed parts at all.
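
To make the third option concrete, here is a minimal sketch of a queue that
carries a dontProcessUntil timestamp per block. It is not the actual
UnderReplicatedBlocks code: the class name DelayedReplicationQueue, its
method names, and the graceMs parameter are all hypothetical, and a real
change would also have to fit the existing priority queues and NN locking.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.Iterator;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch only; not the real UnderReplicatedBlocks class. */
class DelayedReplicationQueue {
    // Block ID -> earliest time (ms) at which the replication monitor
    // is allowed to schedule replication for that block.
    private final Map<Long, Long> dontProcessUntil = new HashMap<>();

    /** Called when completeFile finds a block below its target replication. */
    public synchronized void add(long blockId, long nowMs, long graceMs) {
        dontProcessUntil.put(blockId, nowMs + graceMs);
    }

    /** Called once enough DNs have reported the block via blockReceived. */
    public synchronized void remove(long blockId) {
        dontProcessUntil.remove(blockId);
    }

    /**
     * Called by the replication monitor thread. Returns (and dequeues) only
     * the blocks whose grace period has expired; the rest are skipped and
     * stay queued for a later pass.
     */
    public synchronized List<Long> pollEligible(long nowMs) {
        List<Long> ready = new ArrayList<>();
        Iterator<Map.Entry<Long, Long>> it = dontProcessUntil.entrySet().iterator();
        while (it.hasNext()) {
            Map.Entry<Long, Long> e = it.next();
            if (e.getValue() <= nowMs) {
                ready.add(e.getKey());
                it.remove();
            }
        }
        return ready;
    }
}
```

In this sketch a late blockReceived simply removes the block before its
grace period expires, so no replication is ever scheduled for it; only
blocks that are still short of replicas after the delay reach the monitor.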
Dhruba's workaround of upping min replication indeed helps, but as he said,
it comes at a great cost to the client, *especially* in the cases where it
would help (e.g. if one DN is 10 seconds slow).
> Blocks in newly completed files are considered under-replicated too quickly
> ---------------------------------------------------------------------------
>
> Key: HDFS-1172
> URL: https://issues.apache.org/jira/browse/HDFS-1172
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: name-node
> Affects Versions: 0.21.0
> Reporter: Todd Lipcon
>
> I've seen this for a long time, and imagine it's a known issue, but couldn't
> find an existing JIRA. It often happens that we see the NN schedule
> replication on the last block of files very quickly after they're completed,
> before the other DNs in the pipeline have a chance to report the new block.
> This results in a lot of extra replication work on the cluster, as we
> replicate the block and then end up with multiple excess replicas which are
> very quickly deleted.