[ 
https://issues.apache.org/jira/browse/HDFS-12935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16316140#comment-16316140
 ] 

Brahma Reddy Battula commented on HDFS-12935:
---------------------------------------------

Thanks for updating the patch.

* Please remove the following snippet from the {{setBalancerBandwidth}} command, as 
{{dfs.setBalancerBandwidth(bandwidth);}} itself connects to the Active NameNode.
{code}
    Configuration dfsConf = dfs.getConf();
    URI dfsUri = dfs.getUri();
    boolean isHaEnabled = HAUtilClient.isLogicalUri(dfsConf, dfsUri);

    if (isHaEnabled) {
      String nsId = dfsUri.getHost();
      List<ClientProtocol> namenodes =
          HAUtil.getProxiesForAllNameNodesInNameservice(dfsConf, nsId);
      if (!HAUtil.isAtLeastOneActive(namenodes)) {
        throw new IOException("Cannot set balancer bandwidth " +
            "with no NameNode active");
      }
    }
{code}
* Can we include the cause here? Otherwise we will not know which NN threw 
the exception. Something like the following, for all commands:
{code}
        } catch (IOException ioe) {
          System.out.println("Refresh call queue failed for "
              + proxy.getAddress());
          exceptions.add(ioe);
        }
{code}
* Change {{checkOperation(OperationCategory.READ);}} to 
{{checkOperation(OperationCategory.WRITE);}}
* Can you add the same handling for {{listOpenFiles}} and {{metasave}}? 
Sorry, I missed these two earlier.

Can you also update the {{branch-2}} patch?
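
To make the request about reporting the cause concrete, here is a minimal, self-contained sketch of the pattern: call every NameNode, keep going after a failure, and record which address failed along with the original exception. It does not use the Hadoop classes; {{NameNodeCall}}, {{runOnAll}}, and the {{nn1:9820}} / {{nn2:9820}} addresses are hypothetical stand-ins for the proxy objects and {{proxy.getAddress()}} in the patch, so treat it as an illustration rather than the actual DFSAdmin code.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class PerNameNodeErrors {

  /** Hypothetical stand-in for one call made through a NameNode proxy. */
  interface NameNodeCall {
    void run() throws IOException;
  }

  /**
   * Runs the call against every NameNode. A failure on one NameNode is
   * recorded together with its address and does not stop the loop.
   */
  static List<IOException> runOnAll(Map<String, NameNodeCall> namenodes,
      String commandName, List<String> log) {
    List<IOException> exceptions = new ArrayList<>();
    for (Map.Entry<String, NameNodeCall> entry : namenodes.entrySet()) {
      String address = entry.getKey();
      try {
        entry.getValue().run();
        log.add(commandName + " successful for " + address);
      } catch (IOException ioe) {
        log.add(commandName + " failed for " + address);
        // Wrap the exception so the address travels with the original cause.
        exceptions.add(new IOException(
            commandName + " failed for " + address, ioe));
      }
    }
    return exceptions;
  }

  public static void main(String[] args) {
    // Assumed scenario: nn1 is down, nn2 is up.
    Map<String, NameNodeCall> namenodes = new LinkedHashMap<>();
    namenodes.put("nn1:9820", () -> {
      throw new IOException("Connection refused");
    });
    namenodes.put("nn2:9820", () -> { });

    List<String> log = new ArrayList<>();
    List<IOException> exceptions =
        runOnAll(namenodes, "Refresh call queue", log);
    log.forEach(System.out::println);
    System.out.println(exceptions.size() + " NameNode(s) failed");
  }
}
```

With this shape, the caller can decide afterwards whether one failed NameNode should make the whole command fail, and the user can see from the output which NameNode was unreachable.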

*Test case improvement:* The following can be handled for all the commands in 
a separate JIRA.

{code}
  @Test (timeout = 30000)
  public void testSetBalancerBandwidthNN1DownNN2Up() throws Exception {
    String[] command = { "-setBalancerBandwidth", "10" };
    String message = "Balancer bandwidth is set to 10";
    testExecuteDFSAdminCommand(0, command, message);
  }

  @Test (timeout = 30000)
  public void testSetBalancerBandwidthNN1DownNN2Down() throws Exception {
    String[] command = { "-setBalancerBandwidth", "10" };
    String message = "Balancer bandwidth is set to 10";
    testExecuteDFSAdminCommand(2, command, message);
  }
  
  private void testExecuteDFSAdminCommand(int nnIndex, String[] command,
      String message) throws Exception {
    setUpHaCluster(false);
    switch (nnIndex) {
      case 0:
        cluster.getDfsCluster().shutdownNameNode(0);
        cluster.getDfsCluster().transitionToActive(1);
        break;
      case 1:
        cluster.getDfsCluster().shutdownNameNode(1);
        cluster.getDfsCluster().transitionToActive(0);
        break;
      case 2:
        cluster.getDfsCluster().shutdownNameNode(0);
        cluster.getDfsCluster().shutdownNameNode(1);
        break;
      default:
        break;
    }
    int exitCode = admin.run(command);
    if (nnIndex != 2) {
      assertEquals(err.toString().trim(), 0, exitCode);
      assertOutputMatches(message + newLine);
    } else {
      assertNotEquals(err.toString().trim(), 0, exitCode);
      assertOutputNotMatches(message + newLine);
    }
  }

{code}

> Get ambiguous result for DFSAdmin command in HA mode when only one namenode 
> is up
> ---------------------------------------------------------------------------------
>
>                 Key: HDFS-12935
>                 URL: https://issues.apache.org/jira/browse/HDFS-12935
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: tools
>    Affects Versions: 3.0.0-beta1, 3.0.0
>            Reporter: Jianfei Jiang
>            Assignee: Jianfei Jiang
>         Attachments: HDFS-12935.002.patch, HDFS-12935.003.patch, 
> HDFS-12935.004.patch, HDFS_12935.001.patch
>
>
> In HA mode, if one namenode is down, most functions still work. Consider 
> the following two situations:
>  (1) nn1 up and nn2 down
>  (2) nn1 down and nn2 up
> These two situations should be equivalent. However, some of the DFSAdmin 
> commands give ambiguous results. The commands are sent successfully 
> to the up namenode and are functionally useful only when nn1 is up, 
> regardless of the exception (an IOException when connecting to the down 
> namenode nn2). If only nn2 is up, the commands have no effect at all and only 
> the exception from connecting to nn1 is reported.
> Take the command "hdfs dfsadmin -setBalancerBandwidth", which aims to 
> set the balancer bandwidth value for datanodes, as an example. It works, and 
> all the datanodes get the new value, only when nn1 is up. If only nn2 is 
> up, the command throws an exception immediately and no datanode gets the 
> bandwidth setting. Approximately ten DFSAdmin commands use similar logic 
> and may be ambiguous.
> [root@jiangjianfei01 ~]# hdfs haadmin -getServiceState nn1
> active
> [root@jiangjianfei01 ~]# hdfs dfsadmin -setBalancerBandwidth 12345
> *Balancer bandwidth is set to 12345 for jiangjianfei01/172.17.0.14:9820*
> setBalancerBandwidth: Call From jiangjianfei01/172.17.0.14 to 
> jiangjianfei02:9820 failed on connection exception: 
> java.net.ConnectException: Connection refused; For more details see:  
> http://wiki.apache.org/hadoop/ConnectionRefused
> [root@jiangjianfei01 ~]# hdfs haadmin -getServiceState nn2
> active
> [root@jiangjianfei01 ~]# hdfs dfsadmin -setBalancerBandwidth 1234
> setBalancerBandwidth: Call From jiangjianfei01/172.17.0.14 to 
> jiangjianfei01:9820 failed on connection exception: 
> java.net.ConnectException: Connection refused; For more details see:  
> http://wiki.apache.org/hadoop/ConnectionRefused
> [root@jiangjianfei01 ~]# 


