John Zhuge created HDFS-10289:
---------------------------------
Summary: Balancer configures DNs directly
Key: HDFS-10289
URL: https://issues.apache.org/jira/browse/HDFS-10289
Project: Hadoop HDFS
Issue Type: Improvement
Components: balancer & mover
Affects Versions: 2.6.0
Reporter: John Zhuge
Assignee: John Zhuge
Priority: Critical
The Balancer directly configures the 2 balancing-related properties
(bandwidthPerSec and concurrentMoves) on the DNs involved.
Details:
* Before each balancing iteration, set the properties on all DNs involved in
the current iteration.
* The DN property changes will not survive a DN restart.
* The Balancer gets the property values from its command line or its config
file.
* New DN APIs are needed to query and set the 2 properties.
* No need to edit the config file on each DN or run {{hdfs dfsadmin
-setBalancerBandwidth}} to configure every DN in the cluster.
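The steps above can be sketched roughly as follows. This is only an illustration of the proposed flow, not existing Hadoop code: {{DataNodeBalancerApi}}, {{InMemoryDataNode}}, {{BalancerSettings}} and {{BalancerPushSketch}} are hypothetical names, the DN is simulated in memory, and the numeric values are arbitrary examples.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical DN-side API for the 2 properties; not an existing Hadoop interface. */
interface DataNodeBalancerApi {
    long getBalancerBandwidth();
    void setBalancerBandwidth(long bytesPerSec);
    int getConcurrentMoves();
    void setConcurrentMoves(int maxMoves);
}

/** In-memory stand-in for a DN. Overrides are never persisted, so a restart drops them. */
class InMemoryDataNode implements DataNodeBalancerApi {
    private final long configuredBandwidth; // value from the DN's own config file
    private final int configuredMoves;      // value from the DN's own config file
    private long bandwidth;
    private int moves;

    InMemoryDataNode(long configuredBandwidth, int configuredMoves) {
        this.configuredBandwidth = configuredBandwidth;
        this.configuredMoves = configuredMoves;
        restart();
    }

    /** Simulates a restart: runtime overrides are lost and the file values apply again. */
    void restart() {
        bandwidth = configuredBandwidth;
        moves = configuredMoves;
    }

    @Override public long getBalancerBandwidth() { return bandwidth; }
    @Override public void setBalancerBandwidth(long bytesPerSec) { bandwidth = bytesPerSec; }
    @Override public int getConcurrentMoves() { return moves; }
    @Override public void setConcurrentMoves(int maxMoves) { moves = maxMoves; }
}

/** Values the Balancer read from its command line or its config file. */
final class BalancerSettings {
    final long bandwidthPerSec;
    final int concurrentMoves;

    BalancerSettings(long bandwidthPerSec, int concurrentMoves) {
        this.bandwidthPerSec = bandwidthPerSec;
        this.concurrentMoves = concurrentMoves;
    }
}

final class BalancerPushSketch {
    /** Called before each iteration: push both properties to every DN involved. */
    static void configureForIteration(List<? extends DataNodeBalancerApi> involved,
                                      BalancerSettings settings) {
        for (DataNodeBalancerApi dn : involved) {
            dn.setBalancerBandwidth(settings.bandwidthPerSec);
            dn.setConcurrentMoves(settings.concurrentMoves);
        }
    }

    public static void main(String[] args) {
        List<InMemoryDataNode> involved = new ArrayList<>();
        involved.add(new InMemoryDataNode(1L << 20, 5));
        involved.add(new InMemoryDataNode(1L << 20, 5));
        BalancerSettings settings = new BalancerSettings(10L << 20, 50);

        configureForIteration(involved, settings); // iteration 1
        involved.get(1).restart();                 // DN restarts between iterations
        configureForIteration(involved, settings); // iteration 2 re-applies the values
        System.out.println(involved.get(1).getConcurrentMoves());
    }
}
```

Because the values are pushed again at the start of every iteration, a DN that restarted or joined since the previous iteration picks them up without any per-DN configuration.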
Pros:
* Improves ease of use because all configuration is done in one place, the
Balancer. We have seen many customers forget to set concurrentMoves properly,
since it must be set on both the DNs and the Balancer.
* Supports new DNs added between iterations.
* Handles DN restarts between iterations.
* May allow the thresholds to be adjusted dynamically between iterations,
though it is unclear how useful that would be.
Cons:
* Requires a new DN property API.
* A malicious or misconfigured Balancer may overwhelm DNs. However, {{hdfs
dfsadmin -setBalancerBandwidth}} has the same issue, and the Balancer can only
be run by an admin.
Questions:
* Can we create a {{BalancerConcurrentMovesCommand}} similar to
{{BalancerBandwidthCommand}}? Can the Balancer use them directly without going
through the NN?
One proposal to implement HDFS-7466 calls for an API to query DN properties.
The DN Conf Servlet returns all config properties, but it cannot return an
individual property, and it does not return the value set by {{hdfs dfsadmin
-setBalancerBandwidth}}.