[
https://issues.apache.org/jira/browse/HDFS-1391?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
dhruba borthakur updated HDFS-1391:
-----------------------------------
Attachment: excessReplicas2.txt
Patch can also be reviewed at https://reviews.apache.org/r/196/
Merged patch with latest trunk.
At the time of exiting safemode, we walk through all the blocks, and if a block
has excess replicas we insert it into overReplicatedBlocks (we do not delete the
excess replicas right then and there). Then we exit safemode. The
ReplicationMonitor thread then asynchronously processes each of the blocks in
the overReplicatedBlocks data structure, determining and deleting the excess
replicas. The chooseExcessReplicas method (which can be compute-heavy at times)
is now called without the FSNamesystem lock.
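For reviewers who want the shape of the change without reading the patch, here
is a minimal, self-contained sketch of the pattern described above. It is not
the patch itself: the class DeferredExcessReplicaSketch, its fields, and every
method other than the name chooseExcessReplicas are hypothetical stand-ins, the
"drop the trailing replicas" policy is a placeholder, and re-checking the block
under the lock before deleting is an assumption of the sketch.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Queue;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical sketch: none of this is the real FSNamesystem code or API.
class DeferredExcessReplicaSketch {
    // Stands in for the FSNamesystem lock.
    private final ReentrantLock namesystemLock = new ReentrantLock();
    // Block id -> datanodes currently holding a replica (simplified blocks map).
    private final Map<Long, List<String>> blocksMap = new HashMap<>();
    private final Queue<Long> overReplicatedBlocks = new ArrayDeque<>();
    private final int targetReplication;

    public DeferredExcessReplicaSketch(int targetReplication) {
        this.targetReplication = targetReplication;
    }

    public void addBlock(long blockId, List<String> datanodes) {
        namesystemLock.lock();
        try {
            blocksMap.put(blockId, new ArrayList<>(datanodes));
        } finally {
            namesystemLock.unlock();
        }
    }

    public int replicaCount(long blockId) {
        namesystemLock.lock();
        try {
            List<String> replicas = blocksMap.get(blockId);
            return replicas == null ? 0 : replicas.size();
        } finally {
            namesystemLock.unlock();
        }
    }

    // Safemode exit: only record over-replicated blocks; delete nothing here.
    public int leaveSafeMode() {
        namesystemLock.lock();
        try {
            for (Map.Entry<Long, List<String>> e : blocksMap.entrySet()) {
                if (e.getValue().size() > targetReplication) {
                    overReplicatedBlocks.add(e.getKey());
                }
            }
            return overReplicatedBlocks.size();
        } finally {
            namesystemLock.unlock();
        }
    }

    // One pass of the monitor thread: drain the queue, choosing excess
    // replicas outside the lock and applying the deletions under it.
    public int processOverReplicatedBlocks() {
        int deleted = 0;
        while (true) {
            Long blockId;
            List<String> snapshot;
            namesystemLock.lock();
            try {
                blockId = overReplicatedBlocks.poll();
                List<String> live = blockId == null ? null : blocksMap.get(blockId);
                snapshot = live == null ? new ArrayList<>() : new ArrayList<>(live);
            } finally {
                namesystemLock.unlock();
            }
            if (blockId == null) {
                return deleted;
            }
            // The potentially expensive choice runs without the lock held.
            List<String> excess = chooseExcessReplicas(snapshot);
            namesystemLock.lock();
            try {
                List<String> live = blocksMap.get(blockId);
                for (String node : excess) {
                    // Assumption: re-check that the replica is still present.
                    if (live != null && live.remove(node)) {
                        deleted++;
                    }
                }
            } finally {
                namesystemLock.unlock();
            }
        }
    }

    // Placeholder policy: treat every replica beyond the target as excess.
    private List<String> chooseExcessReplicas(List<String> replicas) {
        int keep = Math.min(targetReplication, replicas.size());
        return new ArrayList<>(replicas.subList(keep, replicas.size()));
    }

    public static void main(String[] args) {
        DeferredExcessReplicaSketch ns = new DeferredExcessReplicaSketch(3);
        ns.addBlock(1L, Arrays.asList("dn1", "dn2", "dn3", "dn4", "dn5"));
        ns.addBlock(2L, Arrays.asList("dn1", "dn2", "dn3"));
        System.out.println("queued=" + ns.leaveSafeMode());
        System.out.println("deleted=" + ns.processOverReplicatedBlocks());
    }
}
```

In the sketch, leaveSafeMode() does only a cheap size comparison per block while
holding the lock, so it returns quickly; the deletions happen later in
processOverReplicatedBlocks(), which takes the lock only briefly per block.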
For a cluster with around 110 million blocks, the "bin/hadoop dfsadmin
-safemode leave" command used to take about 9 minutes before this patch. With
this patch, it takes about 55 seconds!
> Exiting safemode takes a long time when there are lots of blocks in the HDFS
> ----------------------------------------------------------------------------
>
> Key: HDFS-1391
> URL: https://issues.apache.org/jira/browse/HDFS-1391
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: name-node
> Reporter: dhruba borthakur
> Assignee: dhruba borthakur
> Attachments: excessReplicas.1_trunk.txt, excessReplicas2.txt
>
>
> When the namenode decides to exit safemode, it acquires the FSNamesystem
> lock and then iterates over all blocks in the blocksmap to determine if any
> block has any excess replicas. This call takes upwards of 5 minutes on a
> cluster that has 100 million blocks. This significantly delays namenode
> restart.