LeonGao91 opened a new pull request #2483:
URL: https://github.com/apache/hadoop/pull/2483


   The main purpose of the HDFS balancer is usually to bring down the most 
heavily used datanodes, so that no datanode's usage grows too high.
   
   Currently, the balancer picks source nodes almost at random, regardless of 
usage. This makes it slow to bring down the most heavily used datanodes when 
the cluster has few underutilized nodes (consider a cluster expansion).
   
   We can add an option that makes the balancer pick the most heavily used 
nodes as sources first in each iteration.
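
   To illustrate the idea, here is a minimal sketch of ordering source 
candidates by utilization. It is not the code in this pull request: the class 
`SourceOrderingSketch`, the `Node` type, and the `preferTopUsed` flag are 
hypothetical names, and the real balancer works with its own source and 
datanode types.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch, not the Balancer's actual classes or option name.
class SourceOrderingSketch {

    // Stand-in for a datanode and its disk utilization in percent.
    static final class Node {
        final String name;
        final double utilizationPercent;

        Node(String name, double utilizationPercent) {
            this.name = name;
            this.utilizationPercent = utilizationPercent;
        }
    }

    // Returns a copy of the candidates. When preferTopUsed is true, the copy
    // is sorted from most to least utilized; otherwise the given order is kept.
    static List<Node> orderSources(List<Node> candidates, boolean preferTopUsed) {
        List<Node> ordered = new ArrayList<>(candidates);
        if (preferTopUsed) {
            ordered.sort(Comparator
                .comparingDouble((Node n) -> n.utilizationPercent)
                .reversed());
        }
        return ordered;
    }

    public static void main(String[] args) {
        List<Node> candidates = Arrays.asList(
            new Node("dn1", 72.0),
            new Node("dn2", 95.0),
            new Node("dn3", 88.0));
        // Print the sources in the order one iteration would consider them.
        for (Node n : orderSources(candidates, true)) {
            System.out.println(n.name + " " + n.utilizationPercent);
        }
    }
}
```

   With the flag off, the sketch leaves the candidate order unchanged, so the 
existing behavior would stay the default.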

