[ 
https://issues.apache.org/jira/browse/HDFS-12935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16294547#comment-16294547
 ] 

Jianfei Jiang commented on HDFS-12935:
--------------------------------------

In my opinion, when only one namenode is up, the result of a command should be 
the same no matter which namenode it is, not ambiguous. There are two ways to 
fix this:
(1) The command has no effect as long as one namenode is down.
(2) The command still takes effect, but throws an exception for the connection 
failure to the down namenode.

I am not certain which one is more suitable, as I am not familiar with some of 
the commands. I have made a patch for the setBalancerBandwidth command. I 
chose option (2) so that the command stays useful when at least one namenode 
is up.

> Get ambiguous result for DFSAdmin command in HA mode when only one namenode 
> is up
> ---------------------------------------------------------------------------------
>
>                 Key: HDFS-12935
>                 URL: https://issues.apache.org/jira/browse/HDFS-12935
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: tools
>    Affects Versions: 3.0.0-beta1
>            Reporter: Jianfei Jiang
>             Fix For: 3.0.0-beta1
>
>
> In HA mode, if one namenode is down, most functions still work. Consider 
> the following two situations: (1) nn1 up and nn2 down, (2) nn1 down and nn2 
> up. These two situations should be equivalent. However, some of the DFSAdmin 
> commands give ambiguous results. The commands can be sent successfully to 
> the up namenode, but they take effect only when nn1 is up, regardless of the 
> exception (an IOException when connecting to the down namenode). Take the 
> command "hdfs dfsadmin -setBalancerBandwidth", which sets the balancer 
> bandwidth value for datanodes, as an example. It works, and all the datanodes 
> get the new value, only when nn1 is up. If only nn2 is up, the command throws 
> an exception immediately and no datanode gets the bandwidth setting. 
> Approximately ten DFSAdmin commands follow similar logic and may be 
> ambiguous.
> [root@jiangjianfei01 ~]# hdfs haadmin -getServiceState nn1
> active
> [root@jiangjianfei01 ~]# hdfs dfsadmin -setBalancerBandwidth 12345
> *Balancer bandwidth is set to 12345 for jiangjianfei01/172.17.0.14:9820*
> setBalancerBandwidth: Call From jiangjianfei01/172.17.0.14 to 
> jiangjianfei02:9820 failed on connection exception: 
> java.net.ConnectException: Connection refused; For more details see:  
> http://wiki.apache.org/hadoop/ConnectionRefused
> [root@jiangjianfei01 ~]# hdfs haadmin -getServiceState nn2
> active
> [root@jiangjianfei01 ~]# hdfs dfsadmin -setBalancerBandwidth 1234
> setBalancerBandwidth: Call From jiangjianfei01/172.17.0.14 to 
> jiangjianfei01:9820 failed on connection exception: 
> java.net.ConnectException: Connection refused; For more details see:  
> http://wiki.apache.org/hadoop/ConnectionRefused
> [root@jiangjianfei01 ~]# 


