[ https://issues.apache.org/jira/browse/HDFS-5958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13904430#comment-13904430 ]
Alexey Kovyrin commented on HDFS-5958:
--------------------------------------
[~sureshms], here is a piece of my log from the balancer:
https://gist.github.com/kovyrin/9077741/raw/a30429b213fc4a5faca40f96c54f01d52c60706e/gistfile1.txt
Here is a screenshot with all the nodes in the cluster:
http://snap.kovyrin.net/Hadoop_NameNode%C2%A0ops01.dal05.swiftype.net_8020-20140218-141308.jpg
Address-to-name map:
{code}
10.84.56.2 work01
10.60.120.8 work02
10.84.56.10 work03
10.84.56.12 logs01
10.80.72.204 backup01
{code}
> One very large node in a cluster prevents balancer from balancing data
> ----------------------------------------------------------------------
>
> Key: HDFS-5958
> URL: https://issues.apache.org/jira/browse/HDFS-5958
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: balancer
> Affects Versions: 2.2.0
> Environment: Hadoop cluster with 4 nodes: 3 with 500 GB drives and one
> with a 4 TB drive.
> Reporter: Alexey Kovyrin
>
> In a cluster with a set of small nodes and one much larger node, the balancer
> always selects the large node as the target, even though it already has a copy
> of each block in the cluster.
> This causes the balancer to enter an infinite loop and stop balancing the other
> nodes, because each balancing iteration selects the same target and then cannot
> find a single block to move.
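
The failure mode described above can be illustrated with a small sketch. This is not the balancer's actual code: the class `BalancerSketch`, the method `movableBlocks`, and the block IDs are all hypothetical. It only assumes that a block cannot be moved to a node that already stores a replica of it, so a target that holds every block leaves nothing to move.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

public class BalancerSketch {
    // Hypothetical model: a node is just the set of block IDs it stores.
    // A block is movable only if the target does not already hold a replica.
    static List<Integer> movableBlocks(Set<Integer> source, Set<Integer> target) {
        List<Integer> movable = new ArrayList<>();
        for (int block : source) {
            if (!target.contains(block)) {
                movable.add(block);
            }
        }
        return movable;
    }

    public static void main(String[] args) {
        // Made-up block IDs; the large node holds a copy of every block.
        Set<Integer> smallNode = new TreeSet<>(Arrays.asList(1, 2, 3));
        Set<Integer> otherSmallNode = new TreeSet<>(Arrays.asList(4));
        Set<Integer> largeNode = new TreeSet<>(Arrays.asList(1, 2, 3, 4, 5));

        // Large node as target: nothing can be moved, so the iteration makes no progress.
        System.out.println(movableBlocks(smallNode, largeNode));      // prints []
        // Another small node as target: blocks are available to move.
        System.out.println(movableBlocks(smallNode, otherSmallNode)); // prints [1, 2, 3]
    }
}
```

If every iteration picks the large node again, the empty result repeats indefinitely, which matches the loop reported here.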
--
This message was sent by Atlassian JIRA
(v6.1.5#6160)