[ https://issues.apache.org/jira/browse/HDFS-12935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16316140#comment-16316140 ]
Brahma Reddy Battula commented on HDFS-12935:
---------------------------------------------

Thanks for updating the patch.

* Please remove the following snippet from the {{setBalancerBandwidth}} command, as {{dfs.setBalancerBandwidth(bandwidth);}} itself connects to the Active NN.
{code}
Configuration dfsConf = dfs.getConf();
URI dfsUri = dfs.getUri();
boolean isHaEnabled = HAUtilClient.isLogicalUri(dfsConf, dfsUri);
if (isHaEnabled) {
  String nsId = dfsUri.getHost();
  List<ClientProtocol> namenodes =
      HAUtil.getProxiesForAllNameNodesInNameservice(dfsConf, nsId);
  if (!HAUtil.isAtLeastOneActive(namenodes)) {
    throw new IOException("Cannot set balancer bandwidth "
        + "with no NameNode active");
  }
}
{code}
* Can we have the cause here, as otherwise we will not know which NN threw it? Like below, for all commands:
{code}
} catch (IOException ioe) {
  System.out.println("Refresh call queue Failed for " + proxy.getAddress());
  exceptions.add(ioe);
}
{code}
* Change {{checkOperation(OperationCategory.READ);}} to {{checkOperation(OperationCategory.WRITE)}}.
* Can you add this for {{listOpenFiles}} and {{metasave}} as well? Sorry, I missed those two.

Can you update the {{branch-2}} patch also?

*Testcase improvement:* Can we handle the following for all the commands in a separate JIRA?
{code}
@Test (timeout = 30000)
public void testSetBalancerBandwidthNN1DownNN2Up() throws Exception {
  String[] command = { "-setBalancerBandwidth", "10" };
  String message = "Balancer bandwidth is set to 10";
  testExecuteDFSAdminCommand(0, command, message);
}

@Test (timeout = 30000)
public void testSetBalancerBandwidthNN1DownNN2Down() throws Exception {
  String[] command = { "-setBalancerBandwidth", "10" };
  String message = "Balancer bandwidth is set to 10";
  testExecuteDFSAdminCommand(2, command, message);
}

// nnIndex: 0 = NN1 down, 1 = NN2 down, 2 = both NNs down.
private void testExecuteDFSAdminCommand(int nnIndex, String[] command,
    String message) throws Exception {
  setUpHaCluster(false);
  switch (nnIndex) {
  case 0:
    cluster.getDfsCluster().shutdownNameNode(0);
    cluster.getDfsCluster().transitionToActive(1);
    break;
  case 1:
    cluster.getDfsCluster().shutdownNameNode(1);
    cluster.getDfsCluster().transitionToActive(0);
    break;
  case 2:
    cluster.getDfsCluster().shutdownNameNode(0);
    cluster.getDfsCluster().shutdownNameNode(1);
    break;
  default:
    break;
  }
  int exitCode = admin.run(command);
  if (nnIndex != 2) {
    // At least one NN is up: the command must succeed.
    assertEquals(err.toString().trim(), 0, exitCode);
    assertOutputMatches(message + newLine);
  } else {
    // Both NNs are down: the command must fail.
    assertNotEquals(err.toString().trim(), 0, exitCode);
    assertOutputNotMatches(message + newLine);
  }
}
{code}

> Get ambiguous result for DFSAdmin command in HA mode when only one namenode is up
> ---------------------------------------------------------------------------------
>
>                 Key: HDFS-12935
>                 URL: https://issues.apache.org/jira/browse/HDFS-12935
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: tools
>    Affects Versions: 3.0.0-beta1, 3.0.0
>            Reporter: Jianfei Jiang
>            Assignee: Jianfei Jiang
>         Attachments: HDFS-12935.002.patch, HDFS-12935.003.patch,
>                      HDFS-12935.004.patch, HDFS_12935.001.patch
>
>
> In HA mode, if one namenode is down, most functions can still work.
> Consider the following two cases:
> (1) nn1 up and nn2 down
> (2) nn1 down and nn2 up
> These two cases should be equivalent.
> However, some of the DFSAdmin commands give ambiguous results. The commands
> are sent successfully to the running namenode and take effect only when nn1
> is up, despite the exception (an IOException when connecting to the down
> namenode nn2). If only nn2 is up, the commands have no effect at all and
> only the exception from connecting to nn1 is reported.
> Take the command "hdfs dfsadmin -setBalancerBandwidth", which sets the
> balancer bandwidth value for datanodes, as an example. It works, and all the
> datanodes receive the new value, only when nn1 is up. If only nn2 is up, the
> command throws an exception immediately and no datanode gets the bandwidth
> setting. Approximately ten DFSAdmin commands use similar logic and may be
> ambiguous.
> [root@jiangjianfei01 ~]# hdfs haadmin -getServiceState nn1
> active
> [root@jiangjianfei01 ~]# hdfs dfsadmin -setBalancerBandwidth 12345
> *Balancer bandwidth is set to 12345 for jiangjianfei01/172.17.0.14:9820*
> setBalancerBandwidth: Call From jiangjianfei01/172.17.0.14 to jiangjianfei02:9820 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
> [root@jiangjianfei01 ~]# hdfs haadmin -getServiceState nn2
> active
> [root@jiangjianfei01 ~]# hdfs dfsadmin -setBalancerBandwidth 1234
> setBalancerBandwidth: Call From jiangjianfei01/172.17.0.14 to jiangjianfei01:9820 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
> [root@jiangjianfei01 ~]#
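
For illustration only, the per-NameNode error reporting asked for in the review comment could be sketched roughly as below. This is not the code from any attached patch and it does not use Hadoop classes: {{PerNameNodeDispatch}}, {{AdminCall}}, {{Result}} and {{runOnAll}} are hypothetical names, and the exit-code rule (0 when at least one NameNode accepted the call, non-zero when none did) is an assumption taken from the expectations in the proposed test case, not confirmed DFSAdmin behaviour.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: try one admin call on every NameNode and record who failed. */
final class PerNameNodeDispatch {

    /** Hypothetical stand-in for a single admin RPC against one NameNode. */
    @FunctionalInterface
    public interface AdminCall {
        void invoke() throws IOException;
    }

    /** Outcome of trying the call on every NameNode. */
    public static final class Result {
        public final List<String> succeeded = new ArrayList<>();
        public final Map<String, IOException> failed = new LinkedHashMap<>();

        /** Assumption: 0 if at least one NameNode accepted the call, -1 otherwise. */
        public int exitCode() {
            return succeeded.isEmpty() ? -1 : 0;
        }
    }

    /** Invokes each call in map order; a failure on one address does not stop the others. */
    public static Result runOnAll(Map<String, AdminCall> callsByAddress) {
        Result result = new Result();
        for (Map.Entry<String, AdminCall> entry : callsByAddress.entrySet()) {
            String address = entry.getKey();
            try {
                entry.getValue().invoke();
                result.succeeded.add(address);
            } catch (IOException ioe) {
                // Report both the address and the cause, so it is clear which NN failed.
                System.err.println("Call failed for " + address + ": " + ioe.getMessage());
                result.failed.put(address, ioe);
            }
        }
        return result;
    }

    private PerNameNodeDispatch() {
    }
}
```

With this shape, the "nn1 down, nn2 up" and "nn1 up, nn2 down" cases produce the same exit code, and the failure output names the unreachable NameNode together with the cause.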