[
https://issues.apache.org/jira/browse/HDFS-10289?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15243385#comment-15243385
]
Ravi Prakash commented on HDFS-10289:
-------------------------------------
Thanks for trying to improve the Balancer, John! Does anyone remember why the
Balancer was made a separate process from the Namenode, rather than just a
thread in it?
> Balancer configures DNs directly
> --------------------------------
>
> Key: HDFS-10289
> URL: https://issues.apache.org/jira/browse/HDFS-10289
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: balancer & mover
> Affects Versions: 2.6.0
> Reporter: John Zhuge
> Assignee: John Zhuge
> Priority: Critical
>
> Balancer directly configures the 2 balance-related properties
> (bandwidthPerSec and concurrentMoves) on the DNs involved.
> Details:
> * Before each balancing iteration, set the properties on all DNs involved in
> the current iteration.
> * The DN property changes will not survive a restart.
> * Balancer gets the property values from the command line or its config file.
> * Need new DN APIs to query and set the 2 properties.
> * No need to edit the config file on each DN or run {{hdfs dfsadmin
> -setBalancerBandwidth}} to configure every DN in the cluster.
> Pros:
> * Improve ease of use because all configuration is done in one place, the
> balancer. We have seen many customers forget to set concurrentMoves properly,
> since it is required on both the DN and the Balancer.
> * Support new DNs added between iterations
> * Handle DN restarts between iterations
> * May be able to dynamically adjust the thresholds in different iterations.
> It is unclear how useful this would be, though.
> Cons:
> * New DN property API
> * A malicious/misconfigured balancer may overwhelm DNs. {{hdfs dfsadmin
> -setBalancerBandwidth}} has the same issue. Also, the Balancer can only be
> run by an admin.
> Questions:
> * Can we create {{BalancerConcurrentMovesCommand}} similar to
> {{BalancerBandwidthCommand}}? Can Balancer use them directly without going
> through NN?
> One proposal to implement HDFS-7466 calls for an API to query DN properties.
> The DN Conf Servlet returns all config properties. It cannot return an
> individual property, and it does not return the value set by {{hdfs dfsadmin
> -setBalancerBandwidth}}.
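
For readers following the thread, the "set the properties before each
iteration" idea in the description can be sketched roughly as below. This is
only an illustration under stated assumptions: DataNodeControl, FakeDataNode,
and configureForIteration are hypothetical names invented for the sketch, not
existing HDFS classes, and the default values are placeholders. The DN side is
an in-memory fake, so no real RPC is shown.

```java
import java.util.ArrayList;
import java.util.List;

/** Illustrative sketch only; none of these types are part of HDFS. */
public class BalancerDnConfigSketch {

    /** Hypothetical stand-in for the "new DN API" the proposal asks for. */
    interface DataNodeControl {
        void setBalancerBandwidth(long bytesPerSec);
        void setConcurrentMoves(int maxMoves);
    }

    /** In-memory fake DN. Pushed values are not persisted, so restart() loses them. */
    static final class FakeDataNode implements DataNodeControl {
        // Placeholder defaults chosen for the sketch.
        static final long DEFAULT_BANDWIDTH = 1L << 20;
        static final int DEFAULT_MOVES = 5;

        long bandwidthPerSec = DEFAULT_BANDWIDTH;
        int concurrentMoves = DEFAULT_MOVES;

        @Override
        public void setBalancerBandwidth(long bytesPerSec) {
            this.bandwidthPerSec = bytesPerSec;
        }

        @Override
        public void setConcurrentMoves(int maxMoves) {
            this.concurrentMoves = maxMoves;
        }

        /** Simulates a DN restart: runtime overrides fall back to the defaults. */
        void restart() {
            bandwidthPerSec = DEFAULT_BANDWIDTH;
            concurrentMoves = DEFAULT_MOVES;
        }
    }

    /** Pushes both balance-related properties to every DN in the current iteration. */
    static void configureForIteration(List<? extends DataNodeControl> dns,
                                      long bytesPerSec, int maxMoves) {
        for (DataNodeControl dn : dns) {
            dn.setBalancerBandwidth(bytesPerSec);
            dn.setConcurrentMoves(maxMoves);
        }
    }

    public static void main(String[] args) {
        List<FakeDataNode> dns = new ArrayList<>();
        dns.add(new FakeDataNode());
        dns.add(new FakeDataNode());

        long bandwidth = 10L << 20; // value the balancer read from its own config
        int moves = 20;

        for (int iteration = 0; iteration < 3; iteration++) {
            if (iteration == 1) {
                dns.get(0).restart();        // a DN restarts between iterations
            }
            if (iteration == 2) {
                dns.add(new FakeDataNode()); // a new DN joins between iterations
            }
            // Re-applying the settings every iteration covers both cases above.
            configureForIteration(dns, bandwidth, moves);
            for (FakeDataNode dn : dns) {
                System.out.println("iteration " + iteration + ": bandwidth="
                        + dn.bandwidthPerSec + " moves=" + dn.concurrentMoves);
            }
        }
    }
}
```

Because the values are re-sent at the start of every iteration, a restarted or
newly added DN picks them up on the next pass, which is the behavior the
"Pros" list describes.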