[ 
https://issues.apache.org/jira/browse/HDFS-9940?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15198747#comment-15198747
 ] 

Yongjun Zhang commented on HDFS-9940:
-------------------------------------

Hi [~szetszwo],

Thanks for the explanation.

Currently {{dfs.datanode.balance.max.concurrent.moves}} is a single config 
that's shared between the DN and the Balancer.  It has caused confusion and 
trouble in the field. This jira tries to make the config easier and less 
error-prone for users.

HDFS-7466 suggests letting the Balancer query each DN for the config, so that 
each DN can have its own value.  The approach proposed there states that 
the default implementation doesn't need to query the datanodes:
{quote}
Approach 1: mover/balancer to query each data node to obtain this value. This 
can be done via the http port and invoking the "conf" servlet on the datanode . 
This has the drawback that the mover/balancer needs to contact each of the data 
nodes. This implementation can be done as plugin and the default implementation 
could be to read from the local configuration. If a cluster doesn't need this 
accuracy,
{quote}

Current users still use the default implementation, which is what we are 
trying to address here. Even once HDFS-7466 is in place, I assume most users 
will still use the default implementation. Do you agree that we can address 
HDFS-9940 first, then address HDFS-7466 on top of HDFS-9940?

Would you please comment on the two-config approach I proposed earlier? Though 
there is the risk you mentioned of an admin using a wrong value, it can be 
quickly remedied by stopping the balancer. Also, an admin could make a mistake 
configuring the original value too. So I think the concern about this risk 
should probably not outweigh the need to make this config more supportable, 
as proposed here.
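
To make the two-config idea concrete, here is a minimal sketch of how the 
Balancer side might resolve its value. It is only an illustration, not the 
actual patch: the class {{BalancerMovesConfig}} and its methods are 
hypothetical, {{java.util.Properties}} stands in for Hadoop's configuration 
object, the fallback from the Balancer property to the DN property is my 
assumption, and the default of 5 is assumed for the example.

```java
import java.util.Properties;

/** Hypothetical helper; Properties is a stand-in for the Hadoop configuration. */
final class BalancerMovesConfig {
    // Property names as they appear in this jira's description.
    static final String BALANCER_KEY = "dfs.balancer.max.concurrent.moves";
    static final String DATANODE_KEY = "dfs.datanode.balance.max.concurrent.moves";
    // Assumed default, used here for illustration only.
    static final int ASSUMED_DEFAULT = 5;

    /** Value the DN would use: its own property, else the assumed default. */
    static int datanodeMaxMoves(Properties conf) {
        return parseOrDefault(conf.getProperty(DATANODE_KEY), ASSUMED_DEFAULT);
    }

    /**
     * Value the Balancer would use: the Balancer property if set, otherwise
     * (assumption) whatever the DN property resolves to, so that existing
     * single-config setups keep their current behavior.
     */
    static int balancerMaxMoves(Properties conf) {
        return parseOrDefault(conf.getProperty(BALANCER_KEY), datanodeMaxMoves(conf));
    }

    // Treats a missing or blank value as "not set".
    private static int parseOrDefault(String raw, int fallback) {
        if (raw == null || raw.trim().isEmpty()) {
            return fallback;
        }
        return Integer.parseInt(raw.trim());
    }

    public static void main(String[] args) {
        Properties conf = new Properties();
        conf.setProperty(DATANODE_KEY, "20");
        System.out.println(balancerMaxMoves(conf)); // falls back to the DN value

        conf.setProperty(BALANCER_KEY, "8");
        System.out.println(balancerMaxMoves(conf)); // Balancer-specific value wins
        System.out.println(datanodeMaxMoves(conf)); // DN value is unaffected
    }
}
```

With a fallback like this, an admin who only ever set the DN property sees no 
change, and an admin who sets the Balancer property no longer has to touch a 
property with "datanode" in its name.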

More thoughts?

Thanks.





> Rename dfs.balancer.max.concurrent.moves to avoid confusion
> -----------------------------------------------------------
>
>                 Key: HDFS-9940
>                 URL: https://issues.apache.org/jira/browse/HDFS-9940
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: balancer & mover
>    Affects Versions: 2.6.0
>            Reporter: John Zhuge
>            Assignee: John Zhuge
>            Priority: Minor
>              Labels: supportability
>             Fix For: 2.8.0
>
>
> It is very confusing for both Balancer and Datanode to use the same property 
> {{dfs.datanode.balance.max.concurrent.moves}}. It is especially so for the 
> Balancer because the property has "datanode" in the name string. Many 
> customers forget to set the property for the Balancer.
> Change the Balancer to use a new property 
> {{dfs.balancer.max.concurrent.moves}}.



