[ 
https://issues.apache.org/jira/browse/HDFS-8824?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15677540#comment-15677540
 ] 

Kihwal Lee commented on HDFS-8824:
----------------------------------

While this will initially increase the efficiency of balancing, it is not 
without a negative side-effect.

Older nodes in a cluster will slowly fill with smaller blocks as time goes 
on. This is accelerated if the cluster is heterogeneous.  The smaller nodes 
will fill up more quickly and more frequently, and the balancer will move only 
big blocks out of those nodes.  As more balancing happens, those nodes will 
contain more and more small blocks. If sufficient time passes, the blocks on 
those nodes will be almost entirely small.
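
The drift described above can be illustrated with a toy model. This is only a 
sketch under stated assumptions, not Balancer code: the class SmallBlockDrift, 
its methods, the 128 MB / 1 MB block mix, the 10 MB threshold and the 1 GB 
target are all hypothetical values chosen for illustration.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

// Toy model (hypothetical, not HDFS code): one node receives a mix of large
// and small blocks, and a "balancer" drains it by moving only large blocks.
public class SmallBlockDrift {
    static final long MB = 1024L * 1024L;

    /** Fraction of blocks (by count) that are smaller than minBlockSize. */
    static double smallFraction(List<Long> blocks, long minBlockSize) {
        if (blocks.isEmpty()) {
            return 0.0;
        }
        long small = 0;
        for (long size : blocks) {
            if (size < minBlockSize) {
                small++;
            }
        }
        return (double) small / blocks.size();
    }

    /**
     * Removes blocks of at least minBlockSize until the node holds no more
     * than targetBytes (or no movable block is left). Returns bytes moved.
     */
    static long balance(List<Long> blocks, long targetBytes, long minBlockSize) {
        long used = 0;
        for (long size : blocks) {
            used += size;
        }
        long moved = 0;
        Iterator<Long> it = blocks.iterator();
        while (it.hasNext() && used - moved > targetBytes) {
            long size = it.next();
            if (size >= minBlockSize) {   // small blocks are never selected
                it.remove();
                moved += size;
            }
        }
        return moved;
    }

    /** Simulates new writes: 'pairs' large blocks and 'pairs' small blocks. */
    static void write(List<Long> blocks, int pairs) {
        for (int i = 0; i < pairs; i++) {
            blocks.add(128 * MB);
            blocks.add(1 * MB);
        }
    }

    public static void main(String[] args) {
        List<Long> node = new ArrayList<>();
        long minBlockSize = 10 * MB;   // assumed threshold
        long target = 1024 * MB;       // assumed post-balancing usage target
        for (int round = 1; round <= 5; round++) {
            write(node, 10);
            balance(node, target, minBlockSize);
            System.out.printf("round %d: %d blocks, small fraction %.2f%n",
                round, node.size(), smallFraction(node, minBlockSize));
        }
    }
}
```

In this model every small block ever written stays on the node, so the small 
fraction rises each round while the number of large blocks stays roughly flat.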

This feature can be enabled to quickly resolve a storage balance issue, but 
long-term use can have unintended side effects.  Fortunately, we have not 
shipped this feature in any release other than alpha. We can include more 
information in the release note and/or address the issue in the code/config.

> Do not use small blocks for balancing the cluster
> -------------------------------------------------
>
>                 Key: HDFS-8824
>                 URL: https://issues.apache.org/jira/browse/HDFS-8824
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: balancer & mover
>            Reporter: Tsz Wo Nicholas Sze
>            Assignee: Tsz Wo Nicholas Sze
>             Fix For: 2.8.0, 2.7.4, 3.0.0-alpha1
>
>         Attachments: h8824_20150727b.patch, h8824_20150811b.patch
>
>
> The Balancer gets datanode block lists from the NN and then moves the blocks 
> in order to balance the cluster.  It should not use small blocks, since 
> moving them generates a lot of overhead and they do not help balance the 
> cluster much.
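
The rule in the description amounts to a size filter on the candidate block 
list. A minimal sketch, assuming a hypothetical helper (MinBlockSizeFilter and 
movable are illustrative names, not the Balancer's API, and the 10 MB 
threshold is an assumed value):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not HDFS code: keeps only the blocks that are large
// enough to be worth moving.
public class MinBlockSizeFilter {
    /** Returns the sizes that are at least minBlockSize bytes, in order. */
    static List<Long> movable(List<Long> blockSizes, long minBlockSize) {
        List<Long> result = new ArrayList<>();
        for (long size : blockSizes) {
            if (size >= minBlockSize) {
                result.add(size);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        long mb = 1024L * 1024L;
        List<Long> sizes = Arrays.asList(128 * mb, 2 * mb, 64 * mb, 512L);
        // With an assumed 10 MB threshold, only the 128 MB and 64 MB
        // blocks remain as move candidates.
        System.out.println(movable(sizes, 10 * mb));
    }
}
```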


