[jira] [Commented] (HDFS-7527) TestDecommission.testIncludeByRegistrationName fails occassionally in trunk

2018-05-04 Thread Wei-Chiu Chuang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16464177#comment-16464177
 ] 

Wei-Chiu Chuang commented on HDFS-7527:
---

Hmm, somehow this test always times out for me, both before and after the patch.
How come it passed Hadoop precommit?

> TestDecommission.testIncludeByRegistrationName fails occassionally in trunk
> ---
>
> Key: HDFS-7527
> URL: https://issues.apache.org/jira/browse/HDFS-7527
> Project: Hadoop HDFS
>  Issue Type: Bug
>  Components: namenode, test
>Reporter: Yongjun Zhang
>Assignee: Binglin Chang
>Priority: Major
>  Labels: flaky-test
> Attachments: HDFS-7527.001.patch, HDFS-7527.002.patch, 
> HDFS-7527.003.patch
>
>
> https://builds.apache.org/job/Hadoop-Hdfs-trunk/1974/testReport/
> {quote}
> Error Message
> test timed out after 36 milliseconds
> Stacktrace
> java.lang.Exception: test timed out after 36 milliseconds
>   at java.lang.Thread.sleep(Native Method)
>   at 
> org.apache.hadoop.hdfs.TestDecommission.testIncludeByRegistrationName(TestDecommission.java:957)
> 2014-12-15 12:00:19,958 ERROR datanode.DataNode 
> (BPServiceActor.java:run(836)) - Initialization failed for Block pool 
> BP-887397778-67.195.81.153-1418644469024 (Datanode Uuid null) service to 
> localhost/127.0.0.1:40565 Datanode denied communication with namenode because 
> the host is not in the include-list: DatanodeRegistration(127.0.0.1, 
> datanodeUuid=55d8cbff-d8a3-4d6d-ab64-317fff0ee279, infoPort=54318, 
> infoSecurePort=0, ipcPort=43726, 
> storageInfo=lv=-56;cid=testClusterID;nsid=903754315;c=0)
>   at 
> org.apache.hadoop.hdfs.server.blockmanagement.DatanodeManager.registerDatanode(DatanodeManager.java:915)
>   at 
> org.apache.hadoop.hdfs.server.namenode.FSNamesystem.registerDatanode(FSNamesystem.java:4402)
>   at 
> org.apache.hadoop.hdfs.server.namenode.NameNodeRpcServer.registerDatanode(NameNodeRpcServer.java:1196)
>   at 
> org.apache.hadoop.hdfs.protocolPB.DatanodeProtocolServerSideTranslatorPB.registerDatanode(DatanodeProtocolServerSideTranslatorPB.java:92)
>   at 
> org.apache.hadoop.hdfs.protocol.proto.DatanodeProtocolProtos$DatanodeProtocolService$2.callBlockingMethod(DatanodeProtocolProtos.java:26296)
>   at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:637)
>   at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:966)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2127)
>   at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2123)
>   at java.security.AccessController.doPrivileged(Native Method)
>   at javax.security.auth.Subject.doAs(Subject.java:415)
>   at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1669)
>   at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2121)
> 2014-12-15 12:00:29,087 FATAL datanode.DataNode 
> (BPServiceActor.java:run(841)) - Initialization failed for Block pool 
> BP-887397778-67.195.81.153-1418644469024 (Datanode Uuid null) service to 
> localhost/127.0.0.1:40565. Exiting. 
> java.io.IOException: DN shut down before block pool connected
>   at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.retrieveNamespaceInfo(BPServiceActor.java:186)
>   at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.connectToNNAndHandshake(BPServiceActor.java:216)
>   at 
> org.apache.hadoop.hdfs.server.datanode.BPServiceActor.run(BPServiceActor.java:829)
>   at java.lang.Thread.run(Thread.java:745)
> {quote}
> Found by tool proposed in HADOOP-11045:
> {quote}
> [yzhang@localhost jenkinsftf]$ ./determine-flaky-tests-hadoop.py -j 
> Hadoop-Hdfs-trunk -n 5 | tee bt.log
> Recently FAILED builds in url: 
> https://builds.apache.org//job/Hadoop-Hdfs-trunk
> THERE ARE 4 builds (out of 6) that have failed tests in the past 5 days, 
> as listed below:
> ===>https://builds.apache.org/job/Hadoop-Hdfs-trunk/1974/testReport 
> (2014-12-15 03:30:01)
> Failed test: 
> org.apache.hadoop.hdfs.TestDecommission.testIncludeByRegistrationName
> Failed test: 
> org.apache.hadoop.hdfs.server.blockmanagement.TestDatanodeManager.testNumVersionsReportedCorrect
> Failed test: 
> org.apache.hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA.testUpdatePipeline
> ===>https://builds.apache.org/job/Hadoop-Hdfs-trunk/1972/testReport 
> (2014-12-13 10:32:27)
> Failed test: 
> org.apache.hadoop.hdfs.TestDecommission.testIncludeByRegistrationName
> ===>https://builds.apache.org/job/Hadoop-Hdfs-trunk/1971/testReport 
> (2014-12-13 03:30:01)
> Failed test: 
> org.apache.hadoop.hdfs.server.namenode.ha.TestRetryCacheWithHA.testUpdatePipeline
> 

[jira] [Commented] (HDFS-7527) TestDecommission.testIncludeByRegistrationName fails occassionally in trunk

2018-03-12 Thread genericqa (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16396364#comment-16396364
 ] 

genericqa commented on HDFS-7527:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m 
31s{color} | {color:blue} Docker mode activated. {color} |
|| || || || {color:brown} Prechecks {color} ||
| {color:green}+1{color} | {color:green} @author {color} | {color:green}  0m  
0s{color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:green}+1{color} | {color:green} test4tests {color} | {color:green}  0m 
 0s{color} | {color:green} The patch appears to include 1 new or modified test 
files. {color} |
|| || || || {color:brown} trunk Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 20m 
12s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
3s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green}  0m 
55s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m  
8s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 57s{color} | {color:green} branch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
11s{color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
6s{color} | {color:green} trunk passed {color} |
|| || || || {color:brown} Patch Compile Tests {color} ||
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green}  1m 
13s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green}  1m  
8s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green}  1m  
8s{color} | {color:green} the patch passed {color} |
| {color:orange}-0{color} | {color:orange} checkstyle {color} | {color:orange}  
0m 54s{color} | {color:orange} hadoop-hdfs-project/hadoop-hdfs: The patch 
generated 1 new + 21 unchanged - 0 fixed = 22 total (was 21) {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green}  1m 
19s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green}  0m 
 0s{color} | {color:green} The patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} shadedclient {color} | {color:green} 
12m 18s{color} | {color:green} patch has no errors when building and testing 
our client artifacts. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green}  2m 
21s{color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green}  1m  
1s{color} | {color:green} the patch passed {color} |
|| || || || {color:brown} Other Tests {color} ||
| {color:red}-1{color} | {color:red} unit {color} | {color:red}116m 17s{color} 
| {color:red} hadoop-hdfs in the patch failed. {color} |
| {color:green}+1{color} | {color:green} asflicense {color} | {color:green}  0m 
24s{color} | {color:green} The patch does not generate ASF License warnings. 
{color} |
| {color:black}{color} | {color:black} {color} | {color:black}176m 34s{color} | 
{color:black} {color} |
\\
\\
|| Reason || Tests ||
| Failed junit tests | 
hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting |
\\
\\
|| Subsystem || Report/Notes ||
| Docker | Client=17.05.0-ce Server=17.05.0-ce Image:yetus/hadoop:d4cc50f |
| JIRA Issue | HDFS-7527 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12914172/HDFS-7527.003.patch |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  
unit  shadedclient  findbugs  checkstyle  |
| uname | Linux db9fd30c0c31 3.13.0-135-generic #184-Ubuntu SMP Wed Oct 18 
11:55:51 UTC 2017 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 39a5fba |
| maven | version: Apache Maven 3.3.9 |
| Default Java | 1.8.0_151 |
| findbugs | v3.1.0-RC1 |
| checkstyle | 
https://builds.apache.org/job/PreCommit-HDFS-Build/23429/artifact/out/diff-checkstyle-hadoop-hdfs-project_hadoop-hdfs.txt
 |
| unit | 
https://builds.apache.org/job/PreCommit-HDFS-Build/23429/artifact/out/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt
 |
|  Test Results | 
https://builds.apache.org/job/PreCommit-HDFS-Build/23429/testReport/ |
| Max. process+thread count | 3076 (vs. ulimit of 1) |

[jira] [Commented] (HDFS-7527) TestDecommission.testIncludeByRegistrationName fails occassionally in trunk

2018-03-12 Thread Ajay Kumar (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16396197#comment-16396197
 ] 

Ajay Kumar commented on HDFS-7527:
--

Patch v3 is rebased on current trunk. Removed the {{HostFileManager}} change, as it 
is already included, and reduced the datanode heartbeat time to 250ms.


[jira] [Commented] (HDFS-7527) TestDecommission.testIncludeByRegistrationName fails occassionally in trunk

2017-09-26 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=16181807#comment-16181807
 ] 

Hadoop QA commented on HDFS-7527:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  6s{color} 
| {color:red} HDFS-7527 does not apply to trunk. Rebase required? Wrong Branch? 
See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HDFS-7527 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12687971/HDFS-7527.002.patch |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/21373/console |
| Powered by | Apache Yetus 0.6.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.




[jira] [Commented] (HDFS-7527) TestDecommission.testIncludeByRegistrationName fails occassionally in trunk

2017-01-06 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15806018#comment-15806018
 ] 

Hadoop QA commented on HDFS-7527:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue}  0m  
0s{color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red}  0m  4s{color} 
| {color:red} HDFS-7527 does not apply to trunk. Rebase required? Wrong Branch? 
See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Issue | HDFS-7527 |
| JIRA Patch URL | 
https://issues.apache.org/jira/secure/attachment/12687971/HDFS-7527.002.patch |
| Console output | 
https://builds.apache.org/job/PreCommit-HDFS-Build/18084/console |
| Powered by | Apache Yetus 0.5.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.




[jira] [Commented] (HDFS-7527) TestDecommission.testIncludeByRegistrationName fails occassionally in trunk

2014-12-18 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14251415#comment-14251415
 ] 

Hadoop QA commented on HDFS-7527:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12687971/HDFS-7527.002.patch
  against trunk revision 1050d42.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The patch failed these unit tests in 
hadoop-hdfs-project/hadoop-hdfs:

  
org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureToleration

  The following test timeouts occurred in 
hadoop-hdfs-project/hadoop-hdfs:

org.apache.hadoop.hdfs.server.blockmanagement.TestDatanodeManager

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/9070//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9070//console

This message is automatically generated.


[jira] [Commented] (HDFS-7527) TestDecommission.testIncludeByRegistrationName fails occassionally in trunk

2014-12-18 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14252501#comment-14252501
 ] 

Colin Patrick McCabe commented on HDFS-7527:


Thanks for looking at this, [~decster] and [~wheat9].  It's a difficult and 
frustrating area of the code, in my opinion.

Unfortunately, I don't think this latest patch is exactly what we need.  Last 
time we proposed adding more DNS lookups in the {{DatanodeManager}}, the Yahoo 
guys said this was unacceptable from a performance point of view.  Caching DNS 
lookups, so that we didn't have to do them all the time, is a big part of what 
the {{HostFileManager}} was created to do.  [~daryn], [~eli], do you have any 
ideas here?
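
To illustrate the kind of caching the {{HostFileManager}} is meant to provide (a hypothetical sketch, not the actual HDFS code): resolve each include-file entry once when the file is refreshed, so that {{registerDatanode()}} never has to block on DNS.

{code}
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical illustration only: resolve include-file entries once at
// refresh time and answer membership checks from the cache.
public class CachedIncludeList {
  // IP addresses resolved from the include file at refresh time
  private final Set<String> resolvedIps = ConcurrentHashMap.newKeySet();

  public void refresh(Iterable<String> includeEntries) {
    resolvedIps.clear();
    for (String entry : includeEntries) {
      try {
        // one DNS lookup per entry, only when the file is (re)read
        resolvedIps.add(InetAddress.getByName(entry).getHostAddress());
      } catch (UnknownHostException e) {
        // unresolvable entries are simply skipped in this sketch
      }
    }
  }

  // Registration-time check: no DNS lookup on this path. Matching only by IP
  // is also exactly why a registration name that does not resolve to the
  // datanode's address will not be found in the include list.
  public boolean containsIp(String ipAddr) {
    return resolvedIps.contains(ipAddr);
  }
}
{code}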


[jira] [Commented] (HDFS-7527) TestDecommission.testIncludeByRegistrationName fails occassionally in trunk

2014-12-17 Thread Colin Patrick McCabe (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14250767#comment-14250767
 ] 

Colin Patrick McCabe commented on HDFS-7527:


I am -1 for removing this test right now, until we understand this issue 
better.  Putting registration names in the host include and exclude files 
used to work.  If it stopped working, then that's a bug that we should fix.  
Or, alternately, we should have a JIRA to remove registration names entirely.  
Last time we proposed that, it got rejected, though.  See HDFS-5237.

One example of where you might want to set registration names is if you're on 
an AWS instance with internal and external IP interfaces.  On each datanode, 
you would set {{dfs.datanode.hostname}} to the internal IP address to ensure 
that traffic flowed over the internal interface, rather than the (expensive) 
external interfaces.  In this case, you should be able to specify what nodes 
are in the cluster using these same registration names, even if doing reverse 
DNS on the datanode hostnames returns another IP address as the first entry.
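
As a rough illustration of that scenario (the host names and file path below are made up), the datanode would register under the internal name, and the NameNode's include file would list the same names:

{code}
import org.apache.hadoop.conf.Configuration;

public class RegistrationNameExample {
  public static void main(String[] args) {
    // Datanode side: register with the NameNode under the internal address
    // so that cluster traffic stays on the internal (cheap) interface.
    Configuration dnConf = new Configuration();
    dnConf.set("dfs.datanode.hostname", "ip-10-0-0-12.ec2.internal");

    // NameNode side: the include file referenced by dfs.hosts would then
    // list the same registration names, one per line.
    Configuration nnConf = new Configuration();
    nnConf.set("dfs.hosts", "/etc/hadoop/conf/dfs.include");

    System.out.println(dnConf.get("dfs.datanode.hostname"));
    System.out.println(nnConf.get("dfs.hosts"));
  }
}
{code}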


[jira] [Commented] (HDFS-7527) TestDecommission.testIncludeByRegistrationName fails occassionally in trunk

2014-12-17 Thread Binglin Chang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14251244#comment-14251244
 ] 

Binglin Chang commented on HDFS-7527:
-

Makes sense; it looks like the behavior was changed at some point.
I updated the patch to partially support dfs.datanode.hostname (if it is an IP 
address, or the hostname resolves to a proper IP address).
I also changed the test to properly wait for the excluded datanode to come back 
(using Datanode.isDatanodeFullyStarted rather than checking the ALIVE node count).
Note that fully restoring the old behavior requires a lot more changes; for now 
I have only made minimal changes.
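
A minimal sketch of that kind of wait, assuming a {{MiniDFSCluster}} test context and the {{GenericTestUtils.waitFor}} helper (not the exact patch code):

{code}
import java.util.concurrent.TimeoutException;
import org.apache.hadoop.hdfs.MiniDFSCluster;
import org.apache.hadoop.hdfs.server.datanode.DataNode;
import org.apache.hadoop.test.GenericTestUtils;

public class WaitForDataNodeRestart {
  // Wait until the restarted datanode reports itself fully started, instead
  // of polling the number of ALIVE nodes seen by the NameNode.
  static void waitForFullyStarted(MiniDFSCluster cluster, int dnIndex)
      throws TimeoutException, InterruptedException {
    final DataNode dn = cluster.getDataNodes().get(dnIndex);
    GenericTestUtils.waitFor(() -> dn.isDatanodeFullyStarted(), 100, 60000);
  }
}
{code}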


[jira] [Commented] (HDFS-7527) TestDecommission.testIncludeByRegistrationName fails occassionally in trunk

2014-12-16 Thread Binglin Chang (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14248411#comment-14248411
 ] 

Binglin Chang commented on HDFS-7527:
-

I read some related code. The test is intended to verify that the dfs.host list 
can support dfs.datanode.hostname (e.g. if you set a datanode's name to host1, 
and the dfs.host file contains host1, this datanode should be able to connect to 
the namenode).
But after reading the code, it turns out that DatanodeManager checks the dfs.host 
list only by IP address, not by hostname (the namenode resolves all hostnames in 
the dfs.host file to IP addresses), so the expected behavior is for this test to 
fail.
The reason the test passes most of the time is that the code is missing a proper 
wait to make sure the old datanode has expired.

{code}
refreshNodes(cluster.getNamesystem(0), hdfsConf);
cluster.restartDataNode(0);

// There should be some wait time here before the original datanode becomes
// dead, or the following checking code will always succeed, because the old
// datanode is still alive.

// Wait for the DN to come back.
while (true) {
  DatanodeInfo info[] = client.datanodeReport(DatanodeReportType.LIVE);
  if (info.length == 1) {
    Assert.assertFalse(info[0].isDecommissioned());
    Assert.assertFalse(info[0].isDecommissionInProgress());
    assertEquals(registrationName, info[0].getHostName());
    break;
  }
  LOG.info("Waiting for datanode to come back");
  Thread.sleep(HEARTBEAT_INTERVAL * 1000);
}
{code}

I added some sleep time at the place marked in the comment above, and the test 
then always fails, which verifies my theory.
Since the test is not valid, I think we should just remove it.
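
For illustration, the missing wait could look roughly like the following sketch (reusing the test's existing {{client}}, {{LOG}} and {{HEARTBEAT_INTERVAL}}; this is not the actual test or patch code):

{code}
// Sketch only: wait for the pre-restart datanode entry to expire before
// polling for the node to come back, so that the LIVE check in the loop
// above can only be satisfied by a successful re-registration.
while (client.datanodeReport(DatanodeReportType.LIVE).length > 0) {
  LOG.info("Waiting for the old datanode to expire");
  Thread.sleep(HEARTBEAT_INTERVAL * 1000);
}
{code}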
 




[jira] [Commented] (HDFS-7527) TestDecommission.testIncludeByRegistrationName fails occassionally in trunk

2014-12-16 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14248784#comment-14248784
 ] 

Hadoop QA commented on HDFS-7527:
-

{color:red}-1 overall{color}.  Here are the results of testing the latest 
attachment 
  http://issues.apache.org/jira/secure/attachment/12687508/HDFS-7527.001.patch
  against trunk revision 07bb0b0.

{color:green}+1 @author{color}.  The patch does not contain any @author 
tags.

{color:green}+1 tests included{color}.  The patch appears to include 1 new 
or modified test files.

{color:green}+1 javac{color}.  The applied patch does not increase the 
total number of javac compiler warnings.

{color:green}+1 javadoc{color}.  There were no new javadoc warning messages.

{color:green}+1 eclipse:eclipse{color}.  The patch built with 
eclipse:eclipse.

{color:green}+1 findbugs{color}.  The patch does not introduce any new 
Findbugs (version 2.0.3) warnings.

{color:green}+1 release audit{color}.  The applied patch does not increase 
the total number of release audit warnings.

{color:red}-1 core tests{color}.  The following test timeouts occurred in 
hadoop-hdfs-project/hadoop-hdfs:

org.apache.hadoop.hdfs.server.blockmanagement.TestDatanodeManager

Test results: 
https://builds.apache.org/job/PreCommit-HDFS-Build/9052//testReport/
Console output: https://builds.apache.org/job/PreCommit-HDFS-Build/9052//console

This message is automatically generated.


[jira] [Commented] (HDFS-7527) TestDecommission.testIncludeByRegistrationName fails occassionally in trunk

2014-12-16 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14249005#comment-14249005
 ] 

Haohui Mai commented on HDFS-7527:
--

+1


[jira] [Commented] (HDFS-7527) TestDecommission.testIncludeByRegistrationName fails occassionally in trunk

2014-12-16 Thread Haohui Mai (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-7527?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanelfocusedCommentId=14249014#comment-14249014
 ] 

Haohui Mai commented on HDFS-7527:
--

It looks like the test is flaky. [~cmccabe], do you have any comments on this, 
or any ideas on how we can improve the test?
