[
https://issues.apache.org/jira/browse/HDFS-12935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16294547#comment-16294547
]
Jianfei Jiang commented on HDFS-12935:
--------------------------------------
In my opinion, when only one namenode is up, the result of a command should be
the same no matter which namenode it is, rather than ambiguous. There are two
ways to fix this:
(1) The command fails outright as long as one namenode is down.
(2) The command still takes effect through the namenode that is up, but throws
an exception for the connection failure to the down namenode.
I am not certain which one is more suitable, as I am not familiar with some of
the commands. I have made a patch for the setBalancerBandwidth command. I
chose way two, so that the command stays useful when at least one namenode is
up.
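To make the second approach concrete, here is a minimal, self-contained sketch
of the idea. It is not the attached patch and does not use any Hadoop classes:
BestEffortAdmin, NamenodeCall, PartialFailureException and applyToAll are
hypothetical names made up for this illustration. The point is only that the
loop tries every namenode, remembers each connection failure instead of
stopping at the first one, and throws a single exception at the end, so the
outcome no longer depends on which namenode happens to be first in the list.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public final class BestEffortAdmin {

    /** Hypothetical stand-in for one RPC to one namenode (not a Hadoop API). */
    interface NamenodeCall {
        void run() throws IOException;
    }

    /** Thrown after all namenodes were tried and at least one call failed. */
    static final class PartialFailureException extends IOException {
        final List<String> succeeded;

        PartialFailureException(String message, List<String> succeeded) {
            super(message);
            this.succeeded = succeeded;
        }
    }

    /**
     * Runs every call in map order. A failure on one namenode is recorded
     * and the loop continues. Returns the names that succeeded if none
     * failed; otherwise throws one exception that carries the successes
     * and has each individual failure attached as a suppressed exception.
     */
    static List<String> applyToAll(Map<String, NamenodeCall> calls)
            throws PartialFailureException {
        List<String> succeeded = new ArrayList<>();
        List<IOException> failures = new ArrayList<>();
        for (Map.Entry<String, NamenodeCall> entry : calls.entrySet()) {
            try {
                entry.getValue().run();
                succeeded.add(entry.getKey());
            } catch (IOException ex) {
                failures.add(new IOException(
                        entry.getKey() + ": " + ex.getMessage(), ex));
            }
        }
        if (!failures.isEmpty()) {
            PartialFailureException aggregate = new PartialFailureException(
                    failures.size() + " of " + calls.size()
                            + " namenodes failed", succeeded);
            for (IOException failure : failures) {
                aggregate.addSuppressed(failure);
            }
            throw aggregate;
        }
        return succeeded;
    }

    public static void main(String[] args) {
        // Simulated situation from the report: nn1 is down, nn2 is up.
        Map<String, NamenodeCall> calls = new LinkedHashMap<>();
        calls.put("nn1", () -> { throw new IOException("Connection refused"); });
        calls.put("nn2", () -> System.out.println("bandwidth set via nn2"));
        try {
            applyToAll(calls);
        } catch (PartialFailureException e) {
            System.out.println("applied on: " + e.succeeded);
            for (Throwable t : e.getSuppressed()) {
                System.out.println("failed: " + t.getMessage());
            }
        }
    }
}
```

Swapping the two entries in main (nn1 up, nn2 down) goes through the same code
path, which is the symmetry this issue asks for.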
> Get ambiguous result for DFSAdmin command in HA mode when only one namenode
> is up
> ---------------------------------------------------------------------------------
>
> Key: HDFS-12935
> URL: https://issues.apache.org/jira/browse/HDFS-12935
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: tools
> Affects Versions: 3.0.0-beta1
> Reporter: Jianfei Jiang
> Fix For: 3.0.0-beta1
>
>
> In HA mode, if one namenode is down, most functions still work. Consider the
> following two situations: (1) nn1 up and nn2 down, (2) nn1 down and nn2 up.
> These two situations should be equivalent. However, some of the DFSAdmin
> commands give ambiguous results. The commands can be sent successfully to
> the up namenode, but they take effect only when nn1 is up, regardless of the
> exception (an IOException when connecting to the down namenode). Take the
> command "hdfs dfsadmin -setBalancerBandwidth", which sets the balancer
> bandwidth value for datanodes, as an example. It works, and all the
> datanodes get the new value, only when nn1 is up. If only nn2 is up, the
> command throws an exception immediately and no datanode gets the bandwidth
> setting. Approximately ten DFSAdmin commands use similar logic and may be
> ambiguous.
> [root@jiangjianfei01 ~]# hdfs haadmin -getServiceState nn1
> active
> [root@jiangjianfei01 ~]# hdfs dfsadmin -setBalancerBandwidth 12345
> *Balancer bandwidth is set to 12345 for jiangjianfei01/172.17.0.14:9820*
> setBalancerBandwidth: Call From jiangjianfei01/172.17.0.14 to
> jiangjianfei02:9820 failed on connection exception:
> java.net.ConnectException: Connection refused; For more details see:
> http://wiki.apache.org/hadoop/ConnectionRefused
> [root@jiangjianfei01 ~]# hdfs haadmin -getServiceState nn2
> active
> [root@jiangjianfei01 ~]# hdfs dfsadmin -setBalancerBandwidth 1234
> setBalancerBandwidth: Call From jiangjianfei01/172.17.0.14 to
> jiangjianfei01:9820 failed on connection exception:
> java.net.ConnectException: Connection refused; For more details see:
> http://wiki.apache.org/hadoop/ConnectionRefused
> [root@jiangjianfei01 ~]#
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)