[ https://issues.apache.org/jira/browse/HDFS-12935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16317933#comment-16317933 ]
Jianfei Jiang edited comment on HDFS-12935 at 1/9/18 7:49 AM:
--------------------------------------------------------------

Thanks [~brahmareddy] for your suggestions.

* {{isAtLeastOneActive}} was added only to produce an error hint and is not functionally necessary. I will remove it.
* Failure information will be printed to sysout.
* {{checkOperation(OperationCategory.READ);}} has been changed to {{checkOperation(OperationCategory.WRITE)}}.
* {{metasave}} is fixed. I think {{listOpenFiles}} does not need to be fixed, as the following code shows. Unlike the other commands, {{listOpenFiles}} does not call every namenode; in HA mode it contacts only the active one.
{code:java}
if (isHaEnabled) {
  // In HA mode, build a proxy to the active namenode only.
  ProxyAndInfo<ClientProtocol> proxy = NameNodeProxies.createNonHAProxy(
      dfsConf, HAUtil.getAddressOfActive(getDFS()), ClientProtocol.class,
      UserGroupInformation.getCurrentUser(), false);
  openFilesRemoteIterator = new OpenFilesIterator(proxy.getProxy(),
      FsTracer.get(dfsConf));
}
{code}
* A patch for {{branch-2}} will be made. A JIRA for test case improvement will be created as well.

Please review. Thanks

was (Author: jiangjianfei):
Thanks [~brahmareddy] for your suggestions.

* {{isAtLeastOneActive}} was added only to produce an error hint and is not functionally necessary. I will remove it.
* Failure information will be printed to sysout.
* {{checkOperation(OperationCategory.READ);}} has been changed to {{checkOperation(OperationCategory.WRITE)}}.
* {{metasave}} is fixed. I think {{listOpenFiles}} does not need to be fixed, as the following code shows. Unlike the other commands, {{listOpenFiles}} does not call every namenode; in HA mode it contacts only the active one.
{code:java}
if (isHaEnabled) {
  // In HA mode, build a proxy to the active namenode only.
  ProxyAndInfo<ClientProtocol> proxy = NameNodeProxies.createNonHAProxy(
      dfsConf, HAUtil.getAddressOfActive(getDFS()), ClientProtocol.class,
      UserGroupInformation.getCurrentUser(), false);
  openFilesRemoteIterator = new OpenFilesIterator(proxy.getProxy(),
      FsTracer.get(dfsConf));
}
{code}
* A patch for {{branch-2}} will be made.

Please review.
Thanks

> Get ambiguous result for DFSAdmin command in HA mode when only one namenode is up
> ---------------------------------------------------------------------------------
>
>                 Key: HDFS-12935
>                 URL: https://issues.apache.org/jira/browse/HDFS-12935
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: tools
>    Affects Versions: 3.0.0-beta1, 3.0.0
>            Reporter: Jianfei Jiang
>            Assignee: Jianfei Jiang
>         Attachments: HDFS-12935.002.patch, HDFS-12935.003.patch,
>                      HDFS-12935.004.patch, HDFS-12935.005.patch, HDFS_12935.001.patch
>
>
> In HA mode, if one namenode is down, most functions still work. Consider the
> following two cases:
> (1) nn1 up and nn2 down
> (2) nn1 down and nn2 up
> These two cases should be equivalent. However, some DFSAdmin commands give
> ambiguous results. When only nn1 is up, the command is sent successfully to
> nn1 and takes effect, even though an exception is reported (an IOException
> when connecting to the down namenode nn2). When only nn2 is up, the command
> has no effect at all, and only the exception from connecting to nn1 is
> reported.
> Take the command "hdfs dfsadmin -setBalancerBandwidth", which sets the
> balancer bandwidth for datanodes, as an example. It works, and all datanodes
> receive the new value, only when nn1 is up. If only nn2 is up, the command
> throws an exception immediately and no datanode receives the bandwidth
> setting. Approximately ten DFSAdmin commands follow similar logic and may be
> ambiguous in the same way.
> [root@jiangjianfei01 ~]# hdfs haadmin -getServiceState nn1
> active
> [root@jiangjianfei01 ~]# hdfs dfsadmin -setBalancerBandwidth 12345
> *Balancer bandwidth is set to 12345 for jiangjianfei01/172.17.0.14:9820*
> setBalancerBandwidth: Call From jiangjianfei01/172.17.0.14 to jiangjianfei02:9820 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
> [root@jiangjianfei01 ~]# hdfs haadmin -getServiceState nn2
> active
> [root@jiangjianfei01 ~]# hdfs dfsadmin -setBalancerBandwidth 1234
> setBalancerBandwidth: Call From jiangjianfei01/172.17.0.14 to jiangjianfei01:9820 failed on connection exception: java.net.ConnectException: Connection refused; For more details see: http://wiki.apache.org/hadoop/ConnectionRefused
> [root@jiangjianfei01 ~]#
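
The asymmetry in the transcript above is consistent with a command that walks the namenode list in order and stops at the first IOException, although that is an assumption about the cause, not something the report states. A minimal Java sketch of the alternative, which records each namenode's outcome and keeps going, might look like the following. It is not the code from the attached patches: {{FanOutSketch}}, {{AdminCall}}, {{Result}} and {{callAll}} are hypothetical names, and the RPC is simulated by a lambda instead of a real Hadoop call.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class FanOutSketch {

    /** Hypothetical stand-in for one admin RPC against a single namenode. */
    interface AdminCall {
        void invoke(String namenode) throws IOException;
    }

    /** Per-namenode outcome of fanning one call out to every namenode. */
    static final class Result {
        final List<String> succeeded = new ArrayList<>();
        // namenode -> error message, in the order the failures occurred
        final Map<String, String> failed = new LinkedHashMap<>();

        boolean anySucceeded() {
            return !succeeded.isEmpty();
        }
    }

    /** Calls every namenode; a failure on one does not stop the others. */
    static Result callAll(List<String> namenodes, AdminCall call) {
        Result result = new Result();
        for (String nn : namenodes) {
            try {
                call.invoke(nn);
                result.succeeded.add(nn);
            } catch (IOException e) {
                // Record the failure and continue with the next namenode.
                result.failed.put(nn, e.getMessage());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> namenodes = Arrays.asList("nn1", "nn2");
        // Simulate each namenode being down in turn.
        for (String down : namenodes) {
            AdminCall call = nn -> {
                if (nn.equals(down)) {
                    throw new IOException("Connection refused");
                }
            };
            Result r = callAll(namenodes, call);
            System.out.println("down=" + down
                + " succeeded=" + r.succeeded
                + " failed=" + r.failed);
        }
    }
}
```

With this shape, either namenode being down yields one success and one recorded failure, so cases (1) and (2) are reported symmetrically and the caller can decide how to print the failures.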