[
https://issues.apache.org/jira/browse/HDFS-7787?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14318737#comment-14318737
]
Ravi Prakash commented on HDFS-7787:
------------------------------------
By the way, here are guidelines for contributing:
https://wiki.apache.org/hadoop/HowToContribute
Please assign the JIRA to yourself if you intend to work on it.
> Wrong priority of replication
> -----------------------------
>
> Key: HDFS-7787
> URL: https://issues.apache.org/jira/browse/HDFS-7787
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: datanode
> Affects Versions: 2.6.0
> Environment: 2 namenodes HA, 6 datanodes in two racks
> Reporter: Frode Halvorsen
> Labels: balance, hdfs, replication-performance
>
> Each file has a replication factor of 3, with replicas split across
> different racks. After a simulated crash of one rack (shutdown of all
> nodes, deletion of the data directories, and restart of the nodes) and
> decommissioning of one of the nodes in the other rack, replication does
> not follow the 'normal' rules...
> My cluster has approximately 25 million files, and the node I am now
> trying to decommission has 9 million under-replicated blocks and 3.5
> million blocks with 'no live replicas'. After a restart of the node, it
> starts to replicate both types of blocks, but after a while it only
> replicates under-replicated blocks that have other live copies. I would
> think the 'normal' way to do this would be to make sure that all blocks
> for which this node holds the only copy are the first to be
> replicated/balanced. Another thing is that this takes 'forever': at the
> current rate it will run for a couple of months before I can take the
> node down for maintenance, even though the node only holds approximately
> 250 GB of data in total.
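For context, the NameNode does keep under-replicated blocks in separate priority queues, and blocks that are down to their last live replica are supposed to land in the highest-priority queue. Below is a minimal sketch of that kind of priority selection, loosely modelled on UnderReplicatedBlocks in branch-2; the constant names and thresholds here are illustrative assumptions, not the verbatim Hadoop source:
{code:java}
// Illustrative sketch of replication-priority selection, loosely modelled
// on org.apache.hadoop.hdfs.server.blockmanagement.UnderReplicatedBlocks.
// Queue names and thresholds are assumptions for illustration only.
class ReplicationPrioritySketch {
  static final int QUEUE_HIGHEST_PRIORITY = 0;      // last-replica blocks
  static final int QUEUE_VERY_UNDER_REPLICATED = 1; // far below the target
  static final int QUEUE_UNDER_REPLICATED = 2;      // mildly below the target
  static final int QUEUE_WITH_CORRUPT_BLOCKS = 4;   // no live replica at all

  static int getPriority(int liveReplicas,
                         int decommissioningReplicas,
                         int expectedReplicas) {
    if (liveReplicas == 0) {
      // Only remaining copies are on decommissioning nodes: do these first.
      return decommissioningReplicas > 0
          ? QUEUE_HIGHEST_PRIORITY
          : QUEUE_WITH_CORRUPT_BLOCKS; // nothing left to copy from
    } else if (liveReplicas == 1) {
      return QUEUE_HIGHEST_PRIORITY;   // a single live copy is at risk
    } else if (liveReplicas * 3 < expectedReplicas) {
      return QUEUE_VERY_UNDER_REPLICATED;
    }
    return QUEUE_UNDER_REPLICATED;
  }
}
{code}
If blocks with 'no live replicas' are being drained after ordinary under-replicated blocks, it would be worth checking whether they actually end up in the highest-priority queue during decommissioning. On the overall rate: replication work is throttled on the NameNode side, and assuming a 2.x cluster, properties such as dfs.namenode.replication.work.multiplier.per.iteration and dfs.namenode.replication.max-streams are the usual knobs to inspect (defaults are conservative; verify against your version).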
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)