[jira] [Comment Edited] (HDFS-12935) Get ambiguous result for DFSAdmin command in HA mode when only one namenode is up
[ https://issues.apache.org/jira/browse/HDFS-12935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16355839#comment-16355839 ]

Brahma Reddy Battula edited comment on HDFS-12935 at 2/7/18 6:13 PM:
---------------------------------------------------------------------

{{Committed to trunk}} and branch-3.0. [~jiangjianfei], thanks for your contribution; I appreciate your dedication towards closing this Jira.

Re-uploaded the branch-2 patch to run Jenkins.

was (Author: brahmareddy):
{{Committed to {{trunk}} and branch-3.0. [~jiangjianfei], thanks for your contribution; I appreciate your dedication towards closing this Jira.}}

Re-uploaded the branch-2 patch to run Jenkins.

> Get ambiguous result for DFSAdmin command in HA mode when only one namenode
> is up
> ---------------------------------------------------------------------------------
>
>                 Key: HDFS-12935
>                 URL: https://issues.apache.org/jira/browse/HDFS-12935
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: tools
>    Affects Versions: 2.9.0, 3.0.0-beta1, 3.0.0
>            Reporter: Jianfei Jiang
>            Assignee: Jianfei Jiang
>            Priority: Major
>         Attachments: HDFS-12935.002.patch, HDFS-12935.003.patch,
> HDFS-12935.004.patch, HDFS-12935.005.patch, HDFS-12935.006-branch.2.patch,
> HDFS-12935.006.patch, HDFS-12935.007-branch.2.patch, HDFS-12935.007.patch,
> HDFS-12935.008.patch, HDFS-12935.009-branch-2.patch,
> HDFS-12935.009-branch.2.patch, HDFS-12935.009.patch, HDFS_12935.001.patch
>
>
> In HA mode, most functions still work when one namenode is down. Consider
> the following two cases:
> (1) nn1 up and nn2 down
> (2) nn1 down and nn2 up
> These two cases should be equivalent. However, some DFSAdmin commands give
> ambiguous results. A command is sent successfully to the running namenode
> and takes effect only when nn1 is the one that is up, despite the exception
> it also reports (an IOException when connecting to the down namenode nn2).
> If only nn2 is up, the command has no effect at all and only reports the
> exception from connecting to nn1.
> Take "hdfs dfsadmin -setBalancerBandwidth", which sets the balancer
> bandwidth for datanodes, as an example. It works, and all datanodes receive
> the new value, only when nn1 is up. If only nn2 is up, the command throws
> the exception immediately and no datanode gets the bandwidth setting.
> Approximately ten DFSAdmin commands follow similar logic and may be
> ambiguous in the same way.
> [root@jiangjianfei01 ~]# hdfs haadmin -getServiceState nn1
> active
> [root@jiangjianfei01 ~]# hdfs dfsadmin -setBalancerBandwidth 12345
> *Balancer bandwidth is set to 12345 for jiangjianfei01/172.17.0.14:9820*
> setBalancerBandwidth: Call From jiangjianfei01/172.17.0.14 to
> jiangjianfei02:9820 failed on connection exception:
> java.net.ConnectException: Connection refused; For more details see:
> http://wiki.apache.org/hadoop/ConnectionRefused
> [root@jiangjianfei01 ~]# hdfs haadmin -getServiceState nn2
> active
> [root@jiangjianfei01 ~]# hdfs dfsadmin -setBalancerBandwidth 1234
> setBalancerBandwidth: Call From jiangjianfei01/172.17.0.14 to
> jiangjianfei01:9820 failed on connection exception:
> java.net.ConnectException: Connection refused; For more details see:
> http://wiki.apache.org/hadoop/ConnectionRefused
> [root@jiangjianfei01 ~]#

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
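
The asymmetry in the issue description above (the command is useful only when nn1 is the namenode that is up) is what one would expect from a loop that stops at the first namenode that throws. The sketch below is a minimal illustration of the alternative: contact every namenode, record each failure, and keep going. It is not the code from any attached patch. The names NameNodeEndpoint, AllNameNodesSketch, Outcome and the nn1:9820 / nn2:9820 addresses are hypothetical stand-ins, and no Hadoop classes are used.

```java
import java.io.IOException;
import java.net.ConnectException;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for one namenode's admin RPC endpoint (not a Hadoop type). */
interface NameNodeEndpoint {
    void setBalancerBandwidth(long bandwidth) throws IOException;
}

final class AllNameNodesSketch {
    /** Which addresses accepted the call and which ones failed, with the cause. */
    static final class Outcome {
        final List<String> succeeded = new ArrayList<>();
        final Map<String, IOException> failed = new LinkedHashMap<>();
    }

    /** Sends the setting to every namenode in iteration order; a failure never stops the loop. */
    static Outcome setBalancerBandwidth(Map<String, NameNodeEndpoint> nameNodes, long bandwidth) {
        Outcome outcome = new Outcome();
        for (Map.Entry<String, NameNodeEndpoint> entry : nameNodes.entrySet()) {
            try {
                entry.getValue().setBalancerBandwidth(bandwidth);
                outcome.succeeded.add(entry.getKey());
            } catch (IOException ioe) {
                // Remember the failure and continue with the remaining namenodes.
                outcome.failed.put(entry.getKey(), ioe);
            }
        }
        return outcome;
    }

    public static void main(String[] args) {
        // Case (2) from the description: nn1 is down, nn2 is up.
        Map<String, NameNodeEndpoint> nameNodes = new LinkedHashMap<>();
        nameNodes.put("nn1:9820", bw -> { throw new ConnectException("Connection refused"); });
        nameNodes.put("nn2:9820", bw -> { /* setting accepted */ });

        Outcome outcome = setBalancerBandwidth(nameNodes, 1234L);
        System.out.println("succeeded: " + outcome.succeeded);
        System.out.println("failed:    " + outcome.failed.keySet());
    }
}
```

With this shape the result no longer depends on which namenode is listed first. Whether the caller then treats a partial success as an error is a separate design choice that the sketch leaves open.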
[jira] [Comment Edited] (HDFS-12935) Get ambiguous result for DFSAdmin command in HA mode when only one namenode is up
[ https://issues.apache.org/jira/browse/HDFS-12935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16328857#comment-16328857 ]

Jianfei Jiang edited comment on HDFS-12935 at 1/17/18 3:01 PM:
---------------------------------------------------------------

Patch 008: Rebase trunk.

was (Author: jiangjianfei):
Rebase trunk.

> Get ambiguous result for DFSAdmin command in HA mode when only one namenode
> is up
> ---------------------------------------------------------------------------------
>
>                 Key: HDFS-12935
>                 URL: https://issues.apache.org/jira/browse/HDFS-12935
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: tools
>    Affects Versions: 2.9.0, 3.0.0-beta1, 3.0.0
>            Reporter: Jianfei Jiang
>            Assignee: Jianfei Jiang
>            Priority: Major
>         Attachments: HDFS-12935.002.patch, HDFS-12935.003.patch,
> HDFS-12935.004.patch, HDFS-12935.005.patch, HDFS-12935.006-branch.2.patch,
> HDFS-12935.006.patch, HDFS-12935.007-branch.2.patch, HDFS-12935.007.patch,
> HDFS-12935.008.patch, HDFS_12935.001.patch
[jira] [Comment Edited] (HDFS-12935) Get ambiguous result for DFSAdmin command in HA mode when only one namenode is up
[ https://issues.apache.org/jira/browse/HDFS-12935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16328339#comment-16328339 ]

Jianfei Jiang edited comment on HDFS-12935 at 1/17/18 12:31 PM:
----------------------------------------------------------------

Thanks [~brahmareddy] for pointing this out. I will update the patch to handle {{listOpenFiles}}.

was (Author: jiangjianfei):
Thanks [~brahmareddy] for pointing this out. I have updated the patch to handle \{{listOpenFiles}}. Please review if available.

> Get ambiguous result for DFSAdmin command in HA mode when only one namenode
> is up
> ---------------------------------------------------------------------------------
>
>                 Key: HDFS-12935
>                 URL: https://issues.apache.org/jira/browse/HDFS-12935
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: tools
>    Affects Versions: 2.9.0, 3.0.0-beta1, 3.0.0
>            Reporter: Jianfei Jiang
>            Assignee: Jianfei Jiang
>            Priority: Major
>         Attachments: HDFS-12935.002.patch, HDFS-12935.003.patch,
> HDFS-12935.004.patch, HDFS-12935.005.patch, HDFS-12935.006-branch.2.patch,
> HDFS-12935.006.patch, HDFS-12935.007-branch.2.patch, HDFS-12935.007.patch,
> HDFS_12935.001.patch
[jira] [Comment Edited] (HDFS-12935) Get ambiguous result for DFSAdmin command in HA mode when only one namenode is up
[ https://issues.apache.org/jira/browse/HDFS-12935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16317933#comment-16317933 ]

Jianfei Jiang edited comment on HDFS-12935 at 1/9/18 7:49 AM:
--------------------------------------------------------------

Thanks [~brahmareddy] for your suggestions.
* {{isAtLeastOneActive}} was added only to produce an error hint and is not functionally necessary. I will remove it.
* Failure information will be printed to System.out.
* {{checkOperation(OperationCategory.READ);}} is changed to {{checkOperation(OperationCategory.WRITE)}}.
* {{metasave}} is fixed. I think {{listOpenFiles}} does not need to be fixed, as the following code shows. Unlike the other commands, {{listOpenFiles}} does not call every namenode.
{code:java}
if (isHaEnabled) {
  ProxyAndInfo<ClientProtocol> proxy = NameNodeProxies.createNonHAProxy(
      dfsConf, HAUtil.getAddressOfActive(getDFS()), ClientProtocol.class,
      UserGroupInformation.getCurrentUser(), false);
  openFilesRemoteIterator = new OpenFilesIterator(proxy.getProxy(),
      FsTracer.get(dfsConf));
}
{code}
* A patch for {{branch-2}} will be made. A Jira for test case improvement will be created as well.

Please review. Thanks

was (Author: jiangjianfei):
Thanks [~brahmareddy] for your suggestions.
* {{isAtLeastOneActive}} was added only to produce an error hint and is not functionally necessary. I will remove it.
* Failure information will be printed to System.out.
* {{checkOperation(OperationCategory.READ);}} is changed to {{checkOperation(OperationCategory.WRITE)}}.
* {{metasave}} is fixed. I think {{listOpenFiles}} does not need to be fixed, as the following code shows. Unlike the other commands, {{listOpenFiles}} does not call every namenode.
{code:java}
if (isHaEnabled) {
  ProxyAndInfo<ClientProtocol> proxy = NameNodeProxies.createNonHAProxy(
      dfsConf, HAUtil.getAddressOfActive(getDFS()), ClientProtocol.class,
      UserGroupInformation.getCurrentUser(), false);
  openFilesRemoteIterator = new OpenFilesIterator(proxy.getProxy(),
      FsTracer.get(dfsConf));
}
{code}
* A patch for {{branch-2}} will be made.

Please review. Thanks

> Get ambiguous result for DFSAdmin command in HA mode when only one namenode
> is up
> ---------------------------------------------------------------------------------
>
>                 Key: HDFS-12935
>                 URL: https://issues.apache.org/jira/browse/HDFS-12935
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: tools
>    Affects Versions: 3.0.0-beta1, 3.0.0
>            Reporter: Jianfei Jiang
>            Assignee: Jianfei Jiang
>         Attachments: HDFS-12935.002.patch, HDFS-12935.003.patch,
> HDFS-12935.004.patch, HDFS-12935.005.patch, HDFS_12935.001.patch
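
The reasoning in the comment above, that {{listOpenFiles}} is unaffected because it talks only to the active namenode, can be illustrated with a small stand-alone sketch. It assumes a map from namenode address to HA state; NodeState, ActiveOnlySketch and the addresses are hypothetical names chosen for illustration and are not Hadoop types or the behaviour of HAUtil.getAddressOfActive itself.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

/** Hypothetical HA states for illustration; not Hadoop's own state enum. */
enum NodeState { ACTIVE, STANDBY, DOWN }

final class ActiveOnlySketch {
    /** Returns the address of the first namenode reporting ACTIVE, if there is one. */
    static Optional<String> addressOfActive(Map<String, NodeState> states) {
        for (Map.Entry<String, NodeState> entry : states.entrySet()) {
            if (entry.getValue() == NodeState.ACTIVE) {
                return Optional.of(entry.getKey());
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        Map<String, NodeState> states = new LinkedHashMap<>();
        states.put("nn1:9820", NodeState.DOWN);
        states.put("nn2:9820", NodeState.ACTIVE);
        // Only the active namenode is selected, so the down nn1 is never contacted.
        System.out.println(addressOfActive(states).orElse("no active namenode"));
    }
}
```

Because a single target is chosen up front, the order in which namenodes are configured does not change the outcome, which is the property the other commands in this issue lack.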
[jira] [Comment Edited] (HDFS-12935) Get ambiguous result for DFSAdmin command in HA mode when only one namenode is up
[ https://issues.apache.org/jira/browse/HDFS-12935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16302178#comment-16302178 ]

Jianfei Jiang edited comment on HDFS-12935 at 12/23/17 9:58 AM:
----------------------------------------------------------------

The failed testcases are not related. I have re-run them and all passed. [~brahmareddy] Please review. Thanks a lot.

[INFO] -------------------------------------------------------
[INFO]  T E S T S
[INFO] -------------------------------------------------------
[INFO] Running org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure
[INFO] Tests run: 10, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 44.779 s - in org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure
[INFO] Running org.apache.hadoop.hdfs.server.namenode.TestDecommissioningStatus
[INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 39.455 s - in org.apache.hadoop.hdfs.server.namenode.TestDecommissioningStatus
[INFO] Running org.apache.hadoop.hdfs.server.namenode.ha.TestDFSUpgradeWithHA
[INFO] Tests run: 9, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 11.318 s - in org.apache.hadoop.hdfs.server.namenode.ha.TestDFSUpgradeWithHA
[INFO] Running org.apache.hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA
[INFO] Tests run: 22, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 113.323 s - in org.apache.hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA
[INFO] Running org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure090
[WARNING] Tests run: 18, Failures: 0, Errors: 0, Skipped: 2, Time elapsed: 110.711 s - in org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure090
[INFO] Running org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure030
[WARNING] Tests run: 18, Failures: 0, Errors: 0, Skipped: 2, Time elapsed: 110.061 s - in org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure030
[INFO] Running org.apache.hadoop.hdfs.TestRenameWhileOpen
[INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 45.687 s - in org.apache.hadoop.hdfs.TestRenameWhileOpen
[INFO] Running org.apache.hadoop.hdfs.TestEncryptionZones
[INFO] Tests run: 33, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 59.161 s - in org.apache.hadoop.hdfs.TestEncryptionZones
[INFO]
[INFO] Results:
[INFO]
[WARNING] Tests run: 118, Failures: 0, Errors: 0, Skipped: 4

was (Author: jiangjianfei):
The failed testcases are not related. I have re-run them and all passed. [~brahmareddy], [~shahrs87], [~zhenyi] Please review. Thanks a lot.

> Get ambiguous result for DFSAdmin command in HA mode when only one namenode
> is up
> ---------------------------------------------------------------------------------
>
>                 Key: HDFS-12935
>                 URL: https://issues.apache.org/jira/browse/HDFS-12935
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: tools
>    Affects Versions: 3.0.0-beta1, 3.0.0
[jira] [Comment Edited] (HDFS-12935) Get ambiguous result for DFSAdmin command in HA mode when only one namenode is up
[ https://issues.apache.org/jira/browse/HDFS-12935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16302178#comment-16302178 ]

Jianfei Jiang edited comment on HDFS-12935 at 12/23/17 3:46 AM:
----------------------------------------------------------------

The failed testcases are not related. I have re-run them and all passed. [~brahmareddy], [~shahrs87], [~zhenyi] Please review. Thanks a lot.

[INFO] -------------------------------------------------------
[INFO]  T E S T S
[INFO] -------------------------------------------------------
[INFO] Running org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure
[INFO] Tests run: 10, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 44.779 s - in org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure
[INFO] Running org.apache.hadoop.hdfs.server.namenode.TestDecommissioningStatus
[INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 39.455 s - in org.apache.hadoop.hdfs.server.namenode.TestDecommissioningStatus
[INFO] Running org.apache.hadoop.hdfs.server.namenode.ha.TestDFSUpgradeWithHA
[INFO] Tests run: 9, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 11.318 s - in org.apache.hadoop.hdfs.server.namenode.ha.TestDFSUpgradeWithHA
[INFO] Running org.apache.hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA
[INFO] Tests run: 22, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 113.323 s - in org.apache.hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA
[INFO] Running org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure090
[WARNING] Tests run: 18, Failures: 0, Errors: 0, Skipped: 2, Time elapsed: 110.711 s - in org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure090
[INFO] Running org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure030
[WARNING] Tests run: 18, Failures: 0, Errors: 0, Skipped: 2, Time elapsed: 110.061 s - in org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure030
[INFO] Running org.apache.hadoop.hdfs.TestRenameWhileOpen
[INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 45.687 s - in org.apache.hadoop.hdfs.TestRenameWhileOpen
[INFO] Running org.apache.hadoop.hdfs.TestEncryptionZones
[INFO] Tests run: 33, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 59.161 s - in org.apache.hadoop.hdfs.TestEncryptionZones
[INFO]
[INFO] Results:
[INFO]
[WARNING] Tests run: 118, Failures: 0, Errors: 0, Skipped: 4

was (Author: jiangjianfei):
The failed testcases are not related. I have re-run them and all passed. Please review [~brahmareddy] [~shahrs87] [~zhenyi]. Thanks a lot.

> Get ambiguous result for DFSAdmin command in HA mode when only one namenode
> is up
> ---------------------------------------------------------------------------------
>
>                 Key: HDFS-12935
>                 URL: https://issues.apache.org/jira/browse/HDFS-12935
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: tools
>    Affects Versions:
[jira] [Comment Edited] (HDFS-12935) Get ambiguous result for DFSAdmin command in HA mode when only one namenode is up
[ https://issues.apache.org/jira/browse/HDFS-12935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16302178#comment-16302178 ]

Jianfei Jiang edited comment on HDFS-12935 at 12/23/17 3:45 AM:
----------------------------------------------------------------

The failed testcases are not related. I have re-run them and all passed. Please review [~brahmareddy] [~shahrs87] [~zhenyi]. Thanks a lot.

[INFO] -------------------------------------------------------
[INFO]  T E S T S
[INFO] -------------------------------------------------------
[INFO] Running org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure
[INFO] Tests run: 10, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 44.779 s - in org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure
[INFO] Running org.apache.hadoop.hdfs.server.namenode.TestDecommissioningStatus
[INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 39.455 s - in org.apache.hadoop.hdfs.server.namenode.TestDecommissioningStatus
[INFO] Running org.apache.hadoop.hdfs.server.namenode.ha.TestDFSUpgradeWithHA
[INFO] Tests run: 9, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 11.318 s - in org.apache.hadoop.hdfs.server.namenode.ha.TestDFSUpgradeWithHA
[INFO] Running org.apache.hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA
[INFO] Tests run: 22, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 113.323 s - in org.apache.hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA
[INFO] Running org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure090
[WARNING] Tests run: 18, Failures: 0, Errors: 0, Skipped: 2, Time elapsed: 110.711 s - in org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure090
[INFO] Running org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure030
[WARNING] Tests run: 18, Failures: 0, Errors: 0, Skipped: 2, Time elapsed: 110.061 s - in org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure030
[INFO] Running org.apache.hadoop.hdfs.TestRenameWhileOpen
[INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 45.687 s - in org.apache.hadoop.hdfs.TestRenameWhileOpen
[INFO] Running org.apache.hadoop.hdfs.TestEncryptionZones
[INFO] Tests run: 33, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 59.161 s - in org.apache.hadoop.hdfs.TestEncryptionZones
[INFO]
[INFO] Results:
[INFO]
[WARNING] Tests run: 118, Failures: 0, Errors: 0, Skipped: 4

was (Author: jiangjianfei):
The failed testcases are not related. I have re-run them and all passed. Please review [~brahmareddy]. Thanks a lot.

> Get ambiguous result for DFSAdmin command in HA mode when only one namenode
> is up
> ---------------------------------------------------------------------------------
>
>                 Key: HDFS-12935
>                 URL: https://issues.apache.org/jira/browse/HDFS-12935
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: tools
>    Affects Versions: 3.0.0-beta1, 3.0.0
[jira] [Comment Edited] (HDFS-12935) Get ambiguous result for DFSAdmin command in HA mode when only one namenode is up
[ https://issues.apache.org/jira/browse/HDFS-12935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16299786#comment-16299786 ]

Jianfei Jiang edited comment on HDFS-12935 at 12/21/17 9:57 AM:
----------------------------------------------------------------

Added a patch that fixes 10 commands.

was (Author: jiangjianfei):
Fix the following commands:
setSafeMode
saveNamespace
restoreFailedStorage
refreshNodes
setBalancerBandwidth
finalizeUpgrade
refreshServiceAcl
refreshUserToGroupsMappings
refreshSuperUserGroupsConfiguration
refreshCallQueue

> Get ambiguous result for DFSAdmin command in HA mode when only one namenode
> is up
> ---------------------------------------------------------------------------------
>
>                 Key: HDFS-12935
>                 URL: https://issues.apache.org/jira/browse/HDFS-12935
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: tools
>    Affects Versions: 3.0.0-beta1
>            Reporter: Jianfei Jiang
>            Assignee: Jianfei Jiang
>         Attachments: HDFS_12935.001.patch, HDFS_12935.002.patch
[jira] [Comment Edited] (HDFS-12935) Get ambiguous result for DFSAdmin command in HA mode when only one namenode is up
[ https://issues.apache.org/jira/browse/HDFS-12935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16299596#comment-16299596 ]

Jianfei Jiang edited comment on HDFS-12935 at 12/21/17 5:39 AM:
----------------------------------------------------------------

Thanks so much, [~brahmareddy], for your comment and for adding me as a contributor. It's my honor. I have read HowToContribute and learnt a lot.

After reviewing the relevant code, I found that the following commands may have the same issue, and I am going to handle them:
setSafeMode
saveNamespace
restoreFailedStorage
refreshNodes
setBalancerBandwidth
finalizeUpgrade
refreshServiceAcl
refreshUserToGroupsMappings
refreshSuperUserGroupsConfiguration
refreshCallQueue

was (Author: jiangjianfei):
Thanks so much, [~brahmareddy], for your comment and for adding me as a contributor. It is my pleasure. I have read HowToContribute and learnt a lot.

After reviewing the relevant code, I found that the following commands may have the same issue, and I am going to handle them:
setSafeMode
saveNamespace
restoreFailedStorage
refreshNodes
setBalancerBandwidth
finalizeUpgrade
refreshServiceAcl
refreshUserToGroupsMappings
refreshSuperUserGroupsConfiguration
refreshCallQueue

> Get ambiguous result for DFSAdmin command in HA mode when only one namenode
> is up
> ---------------------------------------------------------------------------------
>
>                 Key: HDFS-12935
>                 URL: https://issues.apache.org/jira/browse/HDFS-12935
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: tools
>    Affects Versions: 3.0.0-beta1
>            Reporter: Jianfei Jiang
>            Assignee: Jianfei Jiang
>         Attachments: HDFS_12935.001.patch