[
https://issues.apache.org/jira/browse/HDFS-14894?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16946117#comment-16946117
]
Kihwal Lee commented on HDFS-14894:
-----------------------------------
We could sort nodes based on the utilization, so that highly utilized nodes get
scheduled first.
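
As a rough illustration of this ordering, the Python sketch below sorts a node list by utilization, highest first. The `NodeUsage` type, its fields, and the sample hostnames are hypothetical; none of this is taken from the HDFS Balancer code itself, which is written in Java.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class NodeUsage:
    """Hypothetical record: a datanode and its used-capacity percentage."""
    host: str
    used_percent: float


def schedule_order(nodes):
    """Return the nodes sorted so that the most utilized come first."""
    # sorted() is stable, so nodes with equal utilization keep their input order.
    return sorted(nodes, key=lambda n: n.used_percent, reverse=True)


# Sample data, made up for illustration only.
nodes = [
    NodeUsage("dn1.example.com", 62.5),
    NodeUsage("dn2.example.com", 97.1),
    NodeUsage("dn3.example.com", 88.0),
]
ordered = schedule_order(nodes)
```

With an ordering like this, a scheduler that walks the list front to back would reach the fullest nodes first, which is the effect suggested above.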
> Add balancer parameter to balance top N used nodes
> --------------------------------------------------
>
> Key: HDFS-14894
> URL: https://issues.apache.org/jira/browse/HDFS-14894
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: balancer & mover
> Reporter: Leon Gao
> Assignee: Leon Gao
> Priority: Major
>
> We sometimes see a few of our datanodes reach very high usage (for various
> reasons), and we need to reduce their usage urgently.
> We currently see two ways to achieve this:
> - Calculate and reset the balancing threshold.
> - Pick nodes manually according to usage stats, put them in a file, and use
> the `-resource` flag.
> However, neither is very intuitive, and both involve too much manual work in
> an urgent, close-to-outage situation. Adding a small feature to automatically
> pick the top N used hosts would be a straightforward option, for example
> `-top 10` to target only the 10 most-used datanodes.
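
The selection step behind the proposed `-top N` option could look roughly like the Python sketch below. Note that `-top` is only the flag proposed in this issue, and the function name `top_n_used`, the usage map, and the hostnames are assumptions made for illustration.

```python
import heapq


def top_n_used(usage_by_host, n):
    """Return the n hosts with the highest used percentage, most used first."""
    if n <= 0:
        return []
    # nlargest avoids sorting the whole node list when n is small.
    return heapq.nlargest(n, usage_by_host, key=usage_by_host.get)


# Sample usage percentages, made up for illustration only.
usage = {
    "dn1.example.com": 62.5,
    "dn2.example.com": 97.1,
    "dn3.example.com": 88.0,
    "dn4.example.com": 91.3,
}
targets = top_n_used(usage, 2)
```

The resulting host list plays the same role as the manually prepared hosts file described above, without the manual step.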