[ 
https://issues.apache.org/jira/browse/HDFS-7466?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14231969#comment-14231969
 ] 

Benoy Antony commented on HDFS-7466:
------------------------------------

There are two ways to do this:

Approach 1: The mover/balancer queries each datanode to obtain this value. This 
can be done by invoking the "conf" servlet on the datanode's HTTP port. 
The drawback is that the mover/balancer needs to contact every datanode. 
This could be implemented as a plugin, with the default implementation 
reading the value from the local configuration. If a cluster doesn't need this 
accuracy, 

Approach 2: The mover/balancer obtains this value from the namenode. The 
drawback is that the value needs to be sent in the heartbeat and the 
namenode has to keep track of it. This seems like overkill for a value 
that will be the same in most clusters. Also, the value is useful only to the 
balancer/mover. 

I am planning to implement Approach 1, as the mover/balancer already 
communicates with all the datanodes to schedule the move operations.
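
A rough sketch of what Approach 1 could look like, under some assumptions: 
the datanode serves the "conf" servlet at /conf on its HTTP port and returns 
the usual Hadoop configuration XML (property elements with name and value 
children). The class and method names (DatanodeMoveLimit, parse, fetch) are 
hypothetical and not existing Hadoop APIs. On any failure the sketch falls 
back to the value from the local configuration, which matches the proposed 
default behavior.

```java
import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URI;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

/** Hypothetical helper for illustration; not part of Hadoop. */
final class DatanodeMoveLimit {
  static final String KEY = "dfs.datanode.balance.max.concurrent.moves";

  /**
   * Extracts KEY from conf-servlet style XML.
   * Returns the fallback if the key is absent or the input cannot be parsed.
   */
  static int parse(InputStream confXml, int fallback) {
    try {
      Document doc = DocumentBuilderFactory.newInstance()
          .newDocumentBuilder().parse(confXml);
      NodeList props = doc.getElementsByTagName("property");
      for (int i = 0; i < props.getLength(); i++) {
        Element p = (Element) props.item(i);
        NodeList names = p.getElementsByTagName("name");
        NodeList values = p.getElementsByTagName("value");
        if (names.getLength() > 0 && values.getLength() > 0
            && KEY.equals(names.item(0).getTextContent().trim())) {
          return Integer.parseInt(values.item(0).getTextContent().trim());
        }
      }
    } catch (Exception e) {
      // Malformed XML or a non-numeric value: use the fallback below.
    }
    return fallback;
  }

  /**
   * Queries one datanode over HTTP (assumed path: /conf).
   * Returns localValue if the datanode cannot be reached.
   */
  static int fetch(String host, int httpPort, int localValue) {
    HttpURLConnection conn = null;
    try {
      URI uri = URI.create("http://" + host + ":" + httpPort + "/conf");
      conn = (HttpURLConnection) uri.toURL().openConnection();
      conn.setConnectTimeout(2000);
      conn.setReadTimeout(2000);
      try (InputStream in = conn.getInputStream()) {
        return parse(in, localValue);
      }
    } catch (Exception e) {
      return localValue;
    } finally {
      if (conn != null) {
        conn.disconnect();
      }
    }
  }
}
```

The mover/balancer would call fetch once per datanode and cache the result, 
so the extra HTTP request is paid only once per datanode per run.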

> Allow different values for dfs.datanode.balance.max.concurrent.moves per 
> datanode
> ---------------------------------------------------------------------------------
>
>                 Key: HDFS-7466
>                 URL: https://issues.apache.org/jira/browse/HDFS-7466
>             Project: Hadoop HDFS
>          Issue Type: Sub-task
>          Components: balancer & mover
>            Reporter: Benoy Antony
>            Assignee: Benoy Antony
>
> It is possible to configure a different value for 
> _dfs.datanode.balance.max.concurrent.moves_ on each datanode. However, the 
> balancer/mover uses the value from its own configuration instead. 
> The correct approach is to obtain the value from the datanode itself.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
