[jira] [Comment Edited] (HDFS-12935) Get ambiguous result for DFSAdmin command in HA mode when only one namenode is up

2018-02-07 Thread Brahma Reddy Battula (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16355839#comment-16355839
 ] 

Brahma Reddy Battula edited comment on HDFS-12935 at 2/7/18 6:13 PM:
-

Committed to {{trunk}} and {{branch-3.0}}. [~jiangjianfei], thanks for your 
contribution; I appreciate your dedication towards closing this Jira.

 

Re-uploaded the branch-2 patch to run Jenkins.


was (Author: brahmareddy):
{{Committed to {{trunk}} and branch-3.0 .[~jiangjianfei] thanks for your 
contribution, appreciate your dedication towards close this Jira.}}

 

Re-uploaded the branch-2 patch to run jenkins.

> Get ambiguous result for DFSAdmin command in HA mode when only one namenode 
> is up
> -
>
> Key: HDFS-12935
> URL: https://issues.apache.org/jira/browse/HDFS-12935
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: tools
>Affects Versions: 2.9.0, 3.0.0-beta1, 3.0.0
>Reporter: Jianfei Jiang
>Assignee: Jianfei Jiang
>Priority: Major
> Attachments: HDFS-12935.002.patch, HDFS-12935.003.patch, 
> HDFS-12935.004.patch, HDFS-12935.005.patch, HDFS-12935.006-branch.2.patch, 
> HDFS-12935.006.patch, HDFS-12935.007-branch.2.patch, HDFS-12935.007.patch, 
> HDFS-12935.008.patch, HDFS-12935.009-branch-2.patch, 
> HDFS-12935.009-branch.2.patch, HDFS-12935.009.patch, HDFS_12935.001.patch
>
>
> In HA mode, if one namenode is down, most functions still work. Consider 
> the following two cases:
>  (1) nn1 up and nn2 down
>  (2) nn1 down and nn2 up
> These two cases should be equivalent. However, some DFSAdmin commands 
> give ambiguous results. A command is sent successfully to the live namenode 
> and takes effect only when nn1 is up, even though an exception is also 
> reported (an IOException when connecting to the down namenode nn2). If only 
> nn2 is up, the command has no effect at all and only the exception from 
> connecting to nn1 is reported.
> Take the command "hdfs dfsadmin -setBalancerBandwidth", which sets the 
> balancer bandwidth for datanodes, as an example. It works, and all the 
> datanodes receive the setting, only when nn1 is up. If only nn2 is up, the 
> command throws an exception immediately and no datanode gets the bandwidth 
> setting. Approximately ten DFSAdmin commands follow similar logic and may be 
> ambiguous.
> [root@jiangjianfei01 ~]# hdfs haadmin -getServiceState nn1
> active
> [root@jiangjianfei01 ~]# hdfs dfsadmin -setBalancerBandwidth 12345
> *Balancer bandwidth is set to 12345 for jiangjianfei01/172.17.0.14:9820*
> setBalancerBandwidth: Call From jiangjianfei01/172.17.0.14 to 
> jiangjianfei02:9820 failed on connection exception: 
> java.net.ConnectException: Connection refused; For more details see:  
> http://wiki.apache.org/hadoop/ConnectionRefused
> [root@jiangjianfei01 ~]# hdfs haadmin -getServiceState nn2
> active
> [root@jiangjianfei01 ~]# hdfs dfsadmin -setBalancerBandwidth 1234
> setBalancerBandwidth: Call From jiangjianfei01/172.17.0.14 to 
> jiangjianfei01:9820 failed on connection exception: 
> java.net.ConnectException: Connection refused; For more details see:  
> http://wiki.apache.org/hadoop/ConnectionRefused
> [root@jiangjianfei01 ~]# 
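
The asymmetry in the transcript above can be illustrated with a minimal, self-contained sketch. It is not the HDFS-12935 patch and not the real DFSAdmin code: the names FanOutSketch, NamenodeCall and applyToAll are hypothetical, and the RPC is simulated. The sketch assumes the desired behavior is to attempt the command on every namenode and record each connection failure instead of aborting the loop, so that the outcome no longer depends on whether the down namenode is listed first.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical illustration only: not the HDFS-12935 patch or real DFSAdmin code.
class FanOutSketch {

  /** Stand-in for one RPC to one namenode; may fail with an IOException. */
  interface NamenodeCall {
    void run(String namenode) throws IOException;
  }

  /**
   * Tries the call on every namenode. A failure on one namenode is recorded
   * and the loop continues, so list order does not change which namenodes
   * receive the command. Returns one status line per namenode.
   */
  static List<String> applyToAll(List<String> namenodes, NamenodeCall call) {
    List<String> report = new ArrayList<>();
    for (String nn : namenodes) {
      try {
        call.run(nn);
        report.add("OK " + nn);
      } catch (IOException e) {
        report.add("FAILED " + nn + ": " + e.getMessage());
      }
    }
    return report;
  }

  public static void main(String[] args) {
    // Simulate "nn1 down, nn2 up", the case that failed in the transcript.
    NamenodeCall setBandwidth = nn -> {
      if (nn.equals("nn1")) {
        throw new IOException("Connection refused");
      }
    };
    for (String line : applyToAll(Arrays.asList("nn1", "nn2"), setBandwidth)) {
      System.out.println(line);
    }
  }
}
```

In this simulation the failure on nn1 is reported and nn2 still receives the command, which is the symmetric behavior the report asks for. How the real command should signal a partial failure to the caller is left open here.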



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Comment Edited] (HDFS-12935) Get ambiguous result for DFSAdmin command in HA mode when only one namenode is up

2018-01-17 Thread Jianfei Jiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16328857#comment-16328857
 ] 

Jianfei Jiang edited comment on HDFS-12935 at 1/17/18 3:01 PM:
---

Patch 008: Rebase trunk.


was (Author: jiangjianfei):
Rebase trunk.

> Get ambiguous result for DFSAdmin command in HA mode when only one namenode 
> is up
> -
>
> Key: HDFS-12935
> URL: https://issues.apache.org/jira/browse/HDFS-12935
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: tools
>Affects Versions: 2.9.0, 3.0.0-beta1, 3.0.0
>Reporter: Jianfei Jiang
>Assignee: Jianfei Jiang
>Priority: Major
> Attachments: HDFS-12935.002.patch, HDFS-12935.003.patch, 
> HDFS-12935.004.patch, HDFS-12935.005.patch, HDFS-12935.006-branch.2.patch, 
> HDFS-12935.006.patch, HDFS-12935.007-branch.2.patch, HDFS-12935.007.patch, 
> HDFS-12935.008.patch, HDFS_12935.001.patch
>
>






[jira] [Comment Edited] (HDFS-12935) Get ambiguous result for DFSAdmin command in HA mode when only one namenode is up

2018-01-17 Thread Jianfei Jiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16328339#comment-16328339
 ] 

Jianfei Jiang edited comment on HDFS-12935 at 1/17/18 12:31 PM:


Thanks [~brahmareddy] for pointing this out. I will update the patch to handle 
{{listOpenFiles}}. 


was (Author: jiangjianfei):
Thanks [~brahmareddy] to point out. I have updated the patch to handle 
\{{listOpenFiles}}. Please review if available.

> Get ambiguous result for DFSAdmin command in HA mode when only one namenode 
> is up
> -
>
> Key: HDFS-12935
> URL: https://issues.apache.org/jira/browse/HDFS-12935
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: tools
>Affects Versions: 2.9.0, 3.0.0-beta1, 3.0.0
>Reporter: Jianfei Jiang
>Assignee: Jianfei Jiang
>Priority: Major
> Attachments: HDFS-12935.002.patch, HDFS-12935.003.patch, 
> HDFS-12935.004.patch, HDFS-12935.005.patch, HDFS-12935.006-branch.2.patch, 
> HDFS-12935.006.patch, HDFS-12935.007-branch.2.patch, HDFS-12935.007.patch, 
> HDFS_12935.001.patch
>
>






[jira] [Comment Edited] (HDFS-12935) Get ambiguous result for DFSAdmin command in HA mode when only one namenode is up

2018-01-08 Thread Jianfei Jiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16317933#comment-16317933
 ] 

Jianfei Jiang edited comment on HDFS-12935 at 1/9/18 7:49 AM:
--

Thanks [~brahmareddy] for your suggestions.
*  {{isAtLeastOneActive}} was added only as an error hint and is not 
functionally necessary. I will remove it.
*  Failure information will be written to System.out.
*  {{checkOperation(OperationCategory.READ);}} has been changed to 
{{checkOperation(OperationCategory.WRITE)}}.
*  {{metasave}} is fixed. I think {{listOpenFiles}} does not need to be fixed, 
as the following code shows. Unlike the other commands, {{listOpenFiles}} does 
not call every namenode; it contacts only the active one.
{code:java}
if (isHaEnabled) {
  // In HA mode, talk only to the active namenode rather than to every namenode.
  ProxyAndInfo<ClientProtocol> proxy = NameNodeProxies.createNonHAProxy(
      dfsConf, HAUtil.getAddressOfActive(getDFS()), ClientProtocol.class,
      UserGroupInformation.getCurrentUser(), false);
  openFilesRemoteIterator = new OpenFilesIterator(proxy.getProxy(),
      FsTracer.get(dfsConf));
}
{code}
* A patch for {{branch-2}} will be prepared. A Jira for test case improvements 
will be created as well.
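
The distinction made in the {{listOpenFiles}} bullet can be sketched in isolation. This is a hypothetical illustration, not Hadoop code: ActiveOnlySketch and findActive are invented names, and namenode states are modeled as plain strings, whereas the quoted snippet resolves the active namenode through HAUtil.getAddressOfActive. It only shows why a command that targets the active namenode alone is unaffected by another namenode being down.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical illustration only: states are plain strings, not Hadoop types.
class ActiveOnlySketch {

  /** Returns the first namenode whose reported state is "active", if any. */
  static Optional<String> findActive(Map<String, String> stateByNamenode) {
    return stateByNamenode.entrySet().stream()
        .filter(e -> "active".equals(e.getValue()))
        .map(Map.Entry::getKey)
        .findFirst();
  }

  public static void main(String[] args) {
    Map<String, String> states = new LinkedHashMap<>();
    states.put("nn1", "down");
    states.put("nn2", "active");
    // Only the active namenode is selected; nn1 being down does not matter.
    System.out.println(findActive(states).orElse("no active namenode"));
  }
}
```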

Please review. Thanks


was (Author: jiangjianfei):
Thanks [~brahmareddy] for your suggestions.
*  {{isAtLeastOneActive}} was added just for error hint and not necessary 
functionally. I will remove it.
*  Fail info will be added to sysout.
*  {{checkOperation(OperationCategory.READ);}} changed to 
{{checkOperation(OperationCategory.WRITE)}}
*  {{metasave}} is fixed. I think {{listOpenFiles }} is not needed to be fixed 
as the code following. Not all namenodes are called in {{listOpenFiles }} like 
other commands.
{code:java}
if (isHaEnabled) {
  ProxyAndInfo proxy = NameNodeProxies.createNonHAProxy(
  dfsConf, HAUtil.getAddressOfActive(getDFS()), ClientProtocol.class,
  UserGroupInformation.getCurrentUser(), false);
  openFilesRemoteIterator = new OpenFilesIterator(proxy.getProxy(),
  FsTracer.get(dfsConf));
} 
{code}
* Patch for {{branch-2}} is will be made.

Please review. Thanks

> Get ambiguous result for DFSAdmin command in HA mode when only one namenode 
> is up
> -
>
> Key: HDFS-12935
> URL: https://issues.apache.org/jira/browse/HDFS-12935
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: tools
>Affects Versions: 3.0.0-beta1, 3.0.0
>Reporter: Jianfei Jiang
>Assignee: Jianfei Jiang
> Attachments: HDFS-12935.002.patch, HDFS-12935.003.patch, 
> HDFS-12935.004.patch, HDFS-12935.005.patch, HDFS_12935.001.patch
>
>






[jira] [Comment Edited] (HDFS-12935) Get ambiguous result for DFSAdmin command in HA mode when only one namenode is up

2017-12-23 Thread Jianfei Jiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16302178#comment-16302178
 ] 

Jianfei Jiang edited comment on HDFS-12935 at 12/23/17 9:58 AM:


The failed test cases are unrelated. I re-ran them and they all passed.
[~brahmareddy], please review. Thanks a lot.

[INFO] ---
[INFO]  T E S T S
[INFO] ---
[INFO] Running org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure
[INFO] Tests run: 10, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 44.779 s - in org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailure
[INFO] Running org.apache.hadoop.hdfs.server.namenode.TestDecommissioningStatus
[INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 39.455 s - in org.apache.hadoop.hdfs.server.namenode.TestDecommissioningStatus
[INFO] Running org.apache.hadoop.hdfs.server.namenode.ha.TestDFSUpgradeWithHA
[INFO] Tests run: 9, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 11.318 s - in org.apache.hadoop.hdfs.server.namenode.ha.TestDFSUpgradeWithHA
[INFO] Running org.apache.hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA
[INFO] Tests run: 22, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 113.323 s - in org.apache.hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA
[INFO] Running org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure090
[WARNING] Tests run: 18, Failures: 0, Errors: 0, Skipped: 2, Time elapsed: 110.711 s - in org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure090
[INFO] Running org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure030
[WARNING] Tests run: 18, Failures: 0, Errors: 0, Skipped: 2, Time elapsed: 110.061 s - in org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure030
[INFO] Running org.apache.hadoop.hdfs.TestRenameWhileOpen
[INFO] Tests run: 4, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 45.687 s - in org.apache.hadoop.hdfs.TestRenameWhileOpen
[INFO] Running org.apache.hadoop.hdfs.TestEncryptionZones
[INFO] Tests run: 33, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 59.161 s - in org.apache.hadoop.hdfs.TestEncryptionZones
[INFO]
[INFO] Results:
[INFO]
[WARNING] Tests run: 118, Failures: 0, Errors: 0, Skipped: 4



was (Author: jiangjianfei):
The failed testcases are not related. I have re-run them and all passed.
[~brahmareddy], [~shahrs87], [~zhenyi] Please review. Thanks a lot.



> Get ambiguous result for DFSAdmin command in HA mode when only one namenode 
> is up
> -
>
> Key: HDFS-12935
> URL: https://issues.apache.org/jira/browse/HDFS-12935
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: tools
>Affects Versions: 3.0.0-beta1, 3.0.0
>  

[jira] [Comment Edited] (HDFS-12935) Get ambiguous result for DFSAdmin command in HA mode when only one namenode is up

2017-12-22 Thread Jianfei Jiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16302178#comment-16302178
 ] 

Jianfei Jiang edited comment on HDFS-12935 at 12/23/17 3:46 AM:


The failed testcases are not related. I have re-run them and all passed.
[~brahmareddy], [~shahrs87], [~zhenyi] Please review. Thanks a lot.




was (Author: jiangjianfei):
The failed testcases are not related. I have re-run them and all passed.
Please review [~brahmareddy] [~shahrs87] [~zhenyi]. Thanks a lot.



> Get ambiguous result for DFSAdmin command in HA mode when only one namenode 
> is up
> -
>
> Key: HDFS-12935
> URL: https://issues.apache.org/jira/browse/HDFS-12935
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: tools
>Affects Versions: 

[jira] [Comment Edited] (HDFS-12935) Get ambiguous result for DFSAdmin command in HA mode when only one namenode is up

2017-12-22 Thread Jianfei Jiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16302178#comment-16302178
 ] 

Jianfei Jiang edited comment on HDFS-12935 at 12/23/17 3:45 AM:


The failed testcases are not related. I have re-run them and all passed.
Please review [~brahmareddy] [~shahrs87] [~zhenyi]. Thanks a lot.




was (Author: jiangjianfei):
The failed testcases are not related. I have re-run them and all passed.
Please review [~brahmareddy]. Thanks a lot.



> Get ambiguous result for DFSAdmin command in HA mode when only one namenode 
> is up
> -
>
> Key: HDFS-12935
> URL: https://issues.apache.org/jira/browse/HDFS-12935
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: tools
>Affects Versions: 3.0.0-beta1, 3.0.0
>

[jira] [Comment Edited] (HDFS-12935) Get ambiguous result for DFSAdmin command in HA mode when only one namenode is up

2017-12-21 Thread Jianfei Jiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16299786#comment-16299786
 ] 

Jianfei Jiang edited comment on HDFS-12935 at 12/21/17 9:57 AM:


Added a patch that fixes the 10 affected commands.



was (Author: jiangjianfei):
fix up the following commands: 
setSafeMode
saveNamespace
restoreFailedStorage
refreshNodes
setBalancerBandwidth
finalizeUpgrade
refreshServiceAcl
refreshUserToGroupsMappings
refreshSuperUserGroupsConfiguration
refreshCallQueue

> Get ambiguous result for DFSAdmin command in HA mode when only one namenode 
> is up
> -
>
> Key: HDFS-12935
> URL: https://issues.apache.org/jira/browse/HDFS-12935
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: tools
>Affects Versions: 3.0.0-beta1
>Reporter: Jianfei Jiang
>Assignee: Jianfei Jiang
> Attachments: HDFS_12935.001.patch, HDFS_12935.002.patch
>
>






[jira] [Comment Edited] (HDFS-12935) Get ambiguous result for DFSAdmin command in HA mode when only one namenode is up

2017-12-20 Thread Jianfei Jiang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-12935?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16299596#comment-16299596
 ] 

Jianfei Jiang edited comment on HDFS-12935 at 12/21/17 5:39 AM:


Thanks so much, [~brahmareddy], for your comment and for adding me as a 
contributor. It's my honor. I have read HowToContribute and learnt a lot.
After reviewing the relevant code, I found that the following commands may have 
the same issue, and I am going to handle them:

setSafeMode
saveNamespace
restoreFailedStorage
refreshNodes
setBalancerBandwidth
finalizeUpgrade
refreshServiceAcl
refreshUserToGroupsMappings
refreshSuperUserGroupsConfiguration
refreshCallQueue


was (Author: jiangjianfei):
Thanks  [~brahmareddy]  so much for your comment and adding me as a 
contributor. It is my pleasure.  I have read HowToContribute and learnt a lot.
After viewing the relative code. The following commands may have the some issue 
and I am going to handle.

setSafeMode
saveNamespace
restoreFailedStorage
refreshNodes
setBalancerBandwidth
finalizeUpgrade
refreshServiceAcl
refreshUserToGroupsMappings
refreshSuperUserGroupsConfiguration
refreshCallQueue

> Get ambiguous result for DFSAdmin command in HA mode when only one namenode 
> is up
> -
>
> Key: HDFS-12935
> URL: https://issues.apache.org/jira/browse/HDFS-12935
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: tools
>Affects Versions: 3.0.0-beta1
>Reporter: Jianfei Jiang
>Assignee: Jianfei Jiang
> Attachments: HDFS_12935.001.patch
>
>


