[
https://issues.apache.org/jira/browse/HDFS-12935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16316140#comment-16316140
]
Brahma Reddy Battula commented on HDFS-12935:
---------------------------------------------
Thanks for updating the patch.
* Please remove the following snippet from the {{setBalancerBandwidth}} command, as
{{dfs.setBalancerBandwidth(bandwidth);}} itself connects to the Active NameNode.
{code}
Configuration dfsConf = dfs.getConf();
URI dfsUri = dfs.getUri();
boolean isHaEnabled = HAUtilClient.isLogicalUri(dfsConf, dfsUri);
if (isHaEnabled) {
  String nsId = dfsUri.getHost();
  List<ClientProtocol> namenodes =
      HAUtil.getProxiesForAllNameNodesInNameservice(dfsConf, nsId);
  if (!HAUtil.isAtLeastOneActive(namenodes)) {
    throw new IOException("Cannot set balancer bandwidth " +
        "with no NameNode active");
  }
}
{code}
* Can we include the cause here? Otherwise we will not know which NN threw
the exception. Something like the following, for all commands:
{code}
} catch (IOException ioe) {
  System.out.println("Refresh call queue Failed for "
      + proxy.getAddress());
  exceptions.add(ioe);
}
{code}
* Change {{checkOperation(OperationCategory.READ);}} to
{{checkOperation(OperationCategory.WRITE);}}.
* Can you also add this for {{listOpenFiles}} and {{metasave}}?
Sorry, I missed the above two earlier.
Can you update the {{branch-2}} patch as well?
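To make the per-NameNode reporting suggestion concrete, here is a minimal, self-contained sketch of the pattern: try every NameNode, print which address failed, and keep each exception together with that address. The types {{NameNodeProxy}} and {{Result}} and the helper {{fakeProxy}} are hypothetical stand-ins for illustration only; they are not Hadoop classes and this is not the code in the patch.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PerNameNodeReport {

  /** Hypothetical stand-in for a NameNode proxy; not a Hadoop type. */
  public interface NameNodeProxy {
    String getAddress();
    void refreshCallQueue() throws IOException;
  }

  /** Hypothetical holder for the outcome across all NameNodes. */
  public static final class Result {
    public final List<String> succeeded = new ArrayList<>();
    public final List<IOException> failures = new ArrayList<>();
  }

  /** Calls every proxy; a failure on one NameNode does not stop the others. */
  public static Result refreshOnAll(List<NameNodeProxy> proxies) {
    Result result = new Result();
    for (NameNodeProxy proxy : proxies) {
      try {
        proxy.refreshCallQueue();
        System.out.println("Refresh call queue successful for "
            + proxy.getAddress());
        result.succeeded.add(proxy.getAddress());
      } catch (IOException ioe) {
        // Report which NameNode failed and keep the original cause attached.
        System.out.println("Refresh call queue failed for "
            + proxy.getAddress());
        result.failures.add(
            new IOException("Failed on " + proxy.getAddress(), ioe));
      }
    }
    return result;
  }

  /** Test double: a proxy that either succeeds or refuses the connection. */
  public static NameNodeProxy fakeProxy(final String address, final boolean up) {
    return new NameNodeProxy() {
      @Override
      public String getAddress() {
        return address;
      }

      @Override
      public void refreshCallQueue() throws IOException {
        if (!up) {
          throw new IOException("Connection refused");
        }
      }
    };
  }

  public static void main(String[] args) {
    // nn1 down, nn2 up: the call still reaches nn2 and the nn1 failure is named.
    Result r = refreshOnAll(Arrays.asList(
        fakeProxy("nn1:9820", false), fakeProxy("nn2:9820", true)));
    System.out.println("succeeded=" + r.succeeded
        + ", failures=" + r.failures.size());
  }
}
```

With this shape the caller can decide afterwards whether one failure out of two is an error, instead of stopping at the first NameNode that refuses the connection.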
*Testcase improvement:* The following can be handled for all the commands in
a separate JIRA.
{code}
@Test (timeout = 30000)
public void testSetBalancerBandwidthNN1DownNN2Up() throws Exception {
  String[] command = { "-setBalancerBandwidth", "10" };
  String message = "Balancer bandwidth is set to 10";
  testExecuteDFSAdminCommand(0, command, message);
}

@Test (timeout = 30000)
public void testSetBalancerBandwidthNN1DownNN2Down() throws Exception {
  String[] command = { "-setBalancerBandwidth", "10" };
  String message = "Balancer bandwidth is set to 10";
  testExecuteDFSAdminCommand(2, command, message);
}

private void testExecuteDFSAdminCommand(int nnIndex, String[] command,
    String message) throws Exception {
  setUpHaCluster(false);
  switch (nnIndex) {
  case 0:
    cluster.getDfsCluster().shutdownNameNode(0);
    cluster.getDfsCluster().transitionToActive(1);
    break;
  case 1:
    cluster.getDfsCluster().shutdownNameNode(1);
    cluster.getDfsCluster().transitionToActive(0);
    break;
  case 2:
    cluster.getDfsCluster().shutdownNameNode(0);
    cluster.getDfsCluster().shutdownNameNode(1);
    break;
  default:
  }
  int exitCode = admin.run(command);
  if (nnIndex != 2) {
    assertEquals(err.toString().trim(), 0, exitCode);
    assertOutputMatches(message + newLine);
  } else {
    assertNotEquals(err.toString().trim(), 0, exitCode);
    assertOutputNotMatches(message + newLine);
  }
}
{code}
> Get ambiguous result for DFSAdmin command in HA mode when only one namenode
> is up
> ---------------------------------------------------------------------------------
>
> Key: HDFS-12935
> URL: https://issues.apache.org/jira/browse/HDFS-12935
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: tools
> Affects Versions: 3.0.0-beta1, 3.0.0
> Reporter: Jianfei Jiang
> Assignee: Jianfei Jiang
> Attachments: HDFS-12935.002.patch, HDFS-12935.003.patch,
> HDFS-12935.004.patch, HDFS_12935.001.patch
>
>
> In HA mode, if one namenode is down, most functions still work. Consider
> the following two cases:
> (1) nn1 up and nn2 down
> (2) nn1 down and nn2 up
> These two cases should be equivalent. However, some of the DFSAdmin
> commands give ambiguous results. The commands are sent successfully
> to the up namenode and take effect only when nn1 is up, despite the
> exception (an IOException when connecting to the down namenode nn2).
> If only nn2 is up, the commands have no effect at all and only the
> exception from connecting to nn1 is reported.
> Take the command "hdfs dfsadmin -setBalancerBandwidth", which sets the
> balancer bandwidth value for datanodes, as an example. It works, and all
> the datanodes get the new value, only when nn1 is up. If only nn2 is
> up, the command throws an exception directly and no datanode gets the
> bandwidth setting. Approximately ten DFSAdmin commands use similar logic
> and may be ambiguous.
> [root@jiangjianfei01 ~]# hdfs haadmin -getServiceState nn1
> active
> [root@jiangjianfei01 ~]# hdfs dfsadmin -setBalancerBandwidth 12345
> *Balancer bandwidth is set to 12345 for jiangjianfei01/172.17.0.14:9820*
> setBalancerBandwidth: Call From jiangjianfei01/172.17.0.14 to
> jiangjianfei02:9820 failed on connection exception:
> java.net.ConnectException: Connection refused; For more details see:
> http://wiki.apache.org/hadoop/ConnectionRefused
> [root@jiangjianfei01 ~]# hdfs haadmin -getServiceState nn2
> active
> [root@jiangjianfei01 ~]# hdfs dfsadmin -setBalancerBandwidth 1234
> setBalancerBandwidth: Call From jiangjianfei01/172.17.0.14 to
> jiangjianfei01:9820 failed on connection exception:
> java.net.ConnectException: Connection refused; For more details see:
> http://wiki.apache.org/hadoop/ConnectionRefused
> [root@jiangjianfei01 ~]#