[jira] [Commented] (HDFS-5958) One very large node in a cluster prevents balancer from balancing data

Alexey Kovyrin (JIRA) Tue, 18 Feb 2014 11:19:27 -0800

    [ https://issues.apache.org/jira/browse/HDFS-5958?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13904430#comment-13904430 ]


Alexey Kovyrin commented on HDFS-5958:
--------------------------------------

[~sureshms], here is a piece of my log from the balancer: 
https://gist.github.com/kovyrin/9077741/raw/a30429b213fc4a5faca40f96c54f01d52c60706e/gistfile1.txt

Here is a screenshot with all the nodes in the cluster: 
http://snap.kovyrin.net/Hadoop_NameNode%C2%A0ops01.dal05.swiftype.net_8020-20140218-141308.jpg

Address-to-name map:
{code}
10.84.56.2    work01
10.60.120.8   work02
10.84.56.10   work03
10.84.56.12   logs01
10.80.72.204  backup01
{code}


> One very large node in a cluster prevents balancer from balancing data
> ----------------------------------------------------------------------
>
>                 Key: HDFS-5958
>                 URL: https://issues.apache.org/jira/browse/HDFS-5958
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: balancer
>    Affects Versions: 2.2.0
>         Environment: Hadoop cluster with 4 nodes: 3 with 500 GB drives and one 
> with a 4 TB drive.
>            Reporter: Alexey Kovyrin
>
> In a cluster with a set of small nodes and one much larger node, the balancer 
> always selects the large node as the target, even though it already has a copy 
> of each block in the cluster.
> This causes the balancer to enter an infinite loop and stop balancing the other 
> nodes, because each balancing iteration selects the same target and then cannot 
> find a single block to move.
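
The failure mode described above can be illustrated with a toy model. This is only a sketch and not the Balancer's actual code: the node names, capacities, block counts, and the "pick the below-average node with the most free space" rule are all assumptions made for illustration. It shows that when the chosen target already holds a replica of every block, an iteration plans no moves, even though a small node could still accept blocks.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    name: str                 # hypothetical node name
    capacity: int             # capacity in arbitrary units, one unit per block
    blocks: set = field(default_factory=set)

    @property
    def utilization(self) -> float:
        return len(self.blocks) / self.capacity


def pick_target(nodes, avg):
    """Toy rule (assumption): the below-average node with the most free space."""
    below = [n for n in nodes if n.utilization < avg]
    return max(below, key=lambda n: n.capacity - len(n.blocks))


def movable_blocks(source, target):
    """A block can only move to a node that does not already hold a replica."""
    return source.blocks - target.blocks


def run_iteration(nodes):
    """Plan one balancing pass; returns (target name, list of planned moves)."""
    avg = sum(len(n.blocks) for n in nodes) / sum(n.capacity for n in nodes)
    target = pick_target(nodes, avg)
    sources = [n for n in nodes if n.utilization > avg]
    moves = [(block, s.name, target.name)
             for s in sources
             for block in sorted(movable_blocks(s, target))]
    return target.name, moves


def example_cluster():
    """Three small nodes plus one large node that holds a replica of every block."""
    return [
        Node("small-a", 500, set(range(0, 450))),
        Node("small-b", 500, set(range(450, 750))),
        Node("small-c", 500, set(range(750, 800))),
        Node("large", 4000, set(range(0, 800))),
    ]


if __name__ == "__main__":
    target, moves = run_iteration(example_cluster())
    print(target, len(moves))
```

Because nothing moves, the cluster state is unchanged and every later iteration makes the same choice. In this toy model, skipping a target for which no source has a movable block would let `small-c` be considered instead.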



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
