[
https://issues.apache.org/jira/browse/HDFS-7466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14231969#comment-14231969
]
Benoy Antony commented on HDFS-7466:
------------------------------------
There are two ways to do this:
Approach 1: The mover/balancer queries each data node to obtain this value. This
can be done over the http port by invoking the "conf" servlet on the data node.
The drawback is that the mover/balancer needs to contact each of the data
nodes. This can be implemented as a plugin, with a default implementation that
reads from the local configuration for clusters that don't need this
accuracy.
Approach 2: The mover/balancer obtains this value from the name node. The
drawback is that the value needs to be sent in the heartbeat and the name node
has to keep track of it. This seems like overkill for a value that will be the
same in most clusters and that is useful only to the balancer/mover.
I am planning to implement Approach 1, since the mover/balancer already
communicates with all the data nodes to schedule the move operations.
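
As a rough illustration of Approach 1 (a sketch only, not the planned patch):
the code below assumes the data node's "conf" servlet returns the usual Hadoop
configuration XML (<property><name>/<value> pairs) and shows how a per-datanode
value could be extracted, falling back to the mover/balancer's local value when
the key is missing or the response cannot be parsed. The class and method names
(ConcurrentMovesLookup, confUrl, parseMaxMoves) are hypothetical, and the
"format=xml" query parameter and the http port are assumptions to verify
against the deployed Hadoop version.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Hypothetical helper; not part of Hadoop.
class ConcurrentMovesLookup {
    static final String KEY = "dfs.datanode.balance.max.concurrent.moves";

    // Builds the URL of a data node's "conf" servlet.
    // Host and http port are supplied by the caller (assumed values).
    static String confUrl(String host, int httpPort) {
        return "http://" + host + ":" + httpPort + "/conf?format=xml";
    }

    // Extracts KEY from the servlet's XML response body.
    // Returns the caller's local (fallback) value if the key is absent
    // or the body cannot be parsed.
    static int parseMaxMoves(String confXml, int fallback) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new ByteArrayInputStream(
                    confXml.getBytes(StandardCharsets.UTF_8)));
            NodeList props = doc.getElementsByTagName("property");
            for (int i = 0; i < props.getLength(); i++) {
                Element prop = (Element) props.item(i);
                NodeList names = prop.getElementsByTagName("name");
                NodeList values = prop.getElementsByTagName("value");
                if (names.getLength() > 0 && values.getLength() > 0
                        && KEY.equals(names.item(0).getTextContent().trim())) {
                    return Integer.parseInt(
                        values.item(0).getTextContent().trim());
                }
            }
        } catch (Exception e) {
            // Malformed XML or a non-numeric value: use the fallback below.
        }
        return fallback;
    }
}
```

The mover/balancer would fetch confUrl(host, port) for each data node it
already talks to and pass the response body to parseMaxMoves together with the
value from its own configuration. Keeping the fetch separate from the parsing
also leaves room for the plugin idea: the default plugin can skip the fetch
and return the local value directly.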
> Allow different values for dfs.datanode.balance.max.concurrent.moves per
> datanode
> ---------------------------------------------------------------------------------
>
> Key: HDFS-7466
> URL: https://issues.apache.org/jira/browse/HDFS-7466
> Project: Hadoop HDFS
> Issue Type: Sub-task
> Components: balancer & mover
> Reporter: Benoy Antony
> Assignee: Benoy Antony
>
> It is possible to configure different values for
> _dfs.datanode.balance.max.concurrent.moves_ per datanode. However, the
> balancer/mover uses the value from its own configuration instead.
> The correct approach would be to obtain the value from the datanode itself.