[ 
https://issues.apache.org/jira/browse/HDFS-14894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16946117#comment-16946117
 ] 

Kihwal Lee commented on HDFS-14894:
-----------------------------------

We could sort nodes based on the utilization, so that highly utilized nodes get 
scheduled first. 
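
A minimal sketch of that ordering, combined with the top-N cut from the issue 
description. This is illustrative only and is not the Balancer's actual code: 
`NodeUsage`, `sortByUtilizationDesc`, and `topN` are hypothetical names, and 
utilization is assumed to be DFS used bytes divided by capacity.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

class TopUsedNodes {
    /** Hypothetical stand-in for a datanode usage report; not a Hadoop class. */
    static final class NodeUsage {
        final String host;
        final long dfsUsedBytes;
        final long capacityBytes;

        NodeUsage(String host, long dfsUsedBytes, long capacityBytes) {
            this.host = host;
            this.dfsUsedBytes = dfsUsedBytes;
            this.capacityBytes = capacityBytes;
        }

        /** Assumed definition: used / capacity, as a percentage. */
        double utilizationPercent() {
            return capacityBytes == 0 ? 0.0 : 100.0 * dfsUsedBytes / capacityBytes;
        }
    }

    /** Returns a copy of the input ordered from most to least utilized. */
    static List<NodeUsage> sortByUtilizationDesc(List<NodeUsage> nodes) {
        Comparator<NodeUsage> byUtilization =
            Comparator.comparingDouble(NodeUsage::utilizationPercent);
        List<NodeUsage> sorted = new ArrayList<>(nodes);
        sorted.sort(byUtilization.reversed());
        return sorted;
    }

    /** Keeps only the n most utilized nodes; n is clamped to [0, size]. */
    static List<NodeUsage> topN(List<NodeUsage> nodes, int n) {
        List<NodeUsage> sorted = sortByUtilizationDesc(nodes);
        int limit = Math.min(Math.max(n, 0), sorted.size());
        return new ArrayList<>(sorted.subList(0, limit));
    }

    public static void main(String[] args) {
        // Made-up sample data for illustration.
        List<NodeUsage> nodes = Arrays.asList(
            new NodeUsage("dn1", 50, 100),
            new NodeUsage("dn2", 20, 100),
            new NodeUsage("dn3", 95, 100));
        for (NodeUsage node : topN(nodes, 2)) {
            System.out.println(node.host + " " + node.utilizationPercent());
        }
    }
}
```

With the nodes already ordered this way, limiting a run to the top N is just a 
prefix of the sorted list, so the two ideas fit together without extra state.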

> Add balancer parameter to balance top N used nodes
> --------------------------------------------------
>
>                 Key: HDFS-14894
>                 URL: https://issues.apache.org/jira/browse/HDFS-14894
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: balancer & mover
>            Reporter: Leon Gao
>            Assignee: Leon Gao
>            Priority: Major
>
> We sometimes see a few of our datanodes reach very high usage (for various 
> reasons), and we need to reduce their usage urgently.
> We currently see two ways to achieve this:
> - Calculate and reset the balancing threshold.
> - Pick nodes manually according to usage stats, put them in a file, and use 
> the `-resource` flag.
> However, both are either unintuitive or too much manual work in an urgent, 
> close-to-outage situation. Adding a small feature to automatically pick the 
> top N used hosts would be a straightforward option, for example `-top 10` to 
> target only the 10 most-used datanodes.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

