[ https://issues.apache.org/jira/browse/HDFS-8995?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14726562#comment-14726562 ]
Hudson commented on HDFS-8995:
------------------------------
FAILURE: Integrated in Hadoop-Yarn-trunk-Java8 #339 (See
[https://builds.apache.org/job/Hadoop-Yarn-trunk-Java8/339/])
HDFS-8995. Flaw in registration bookkeeping can make DN die on reconnect.
(Kihwal Lee via yliu) (yliu: rev 5652131d2ea68c408dd3cd8bee31723642a8cdde)
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataNode.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BPOfferService.java
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BPServiceActor.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
> Flaw in registration bookkeeping can make DN die on reconnect
> -------------------------------------------------------------
>
> Key: HDFS-8995
> URL: https://issues.apache.org/jira/browse/HDFS-8995
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: Kihwal Lee
> Assignee: Kihwal Lee
> Priority: Critical
> Fix For: 2.7.2
>
> Attachments: HDFS-8995.patch
>
>
> Normally, datanodes re-register with the namenode when it has been
> unreachable for longer than the heartbeat expiration interval and then
> becomes reachable again. Each datanode keeps retrying its last RPC call, such
> as an incremental block report or a heartbeat, and when the call finally gets
> through, the namenode tells the datanode to re-register.
> We have observed that some datanodes stay dead in such scenarios. Further
> investigation revealed that they had been told to shut down by the namenode.
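
For readers unfamiliar with the normal reconnect path described above, here is a minimal toy model of it in Java. It is only a sketch under stated assumptions: the class and method names (ToyNameNode, Command, simulate) are hypothetical and are not the Hadoop classes touched by the patch, and the sketch shows the expected re-registration flow only, not the bookkeeping flaw itself.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class ReRegistrationSketch {
    /** Hypothetical reply a namenode can attach to a heartbeat response. */
    enum Command { NONE, REGISTER }

    /** Toy namenode: remembers the last heartbeat time of each registered datanode. */
    static class ToyNameNode {
        private final long expiryMillis;
        private final Map<String, Long> lastSeen = new HashMap<>();

        ToyNameNode(long expiryMillis) {
            this.expiryMillis = expiryMillis;
        }

        void register(String dnId, long now) {
            lastSeen.put(dnId, now);
        }

        /** Forgets datanodes whose last heartbeat is older than the expiry interval. */
        private void expireDeadNodes(long now) {
            lastSeen.values().removeIf(t -> now - t > expiryMillis);
        }

        /** Unknown or expired datanodes are told to register again. */
        Command heartbeat(String dnId, long now) {
            expireDeadNodes(now);
            if (!lastSeen.containsKey(dnId)) {
                return Command.REGISTER;
            }
            lastSeen.put(dnId, now);
            return Command.NONE;
        }
    }

    /** Runs one outage-and-reconnect cycle and returns the observed events. */
    static List<String> simulate() {
        List<String> events = new ArrayList<>();
        ToyNameNode nn = new ToyNameNode(1000); // assumed expiry of 1000 ms
        String dn = "dn-1";

        nn.register(dn, 0);
        events.add("registered");

        // Outage: the datanode keeps retrying its heartbeat, but nothing
        // reaches the namenode until t = 3000, well past the expiry interval.
        long now = 3000;
        Command cmd = nn.heartbeat(dn, now);
        events.add("heartbeat->" + cmd);

        // The retried call finally got through; obey the namenode's command.
        if (cmd == Command.REGISTER) {
            nn.register(dn, now);
            events.add("re-registered");
        }

        events.add("heartbeat->" + nn.heartbeat(dn, now + 100));
        return events;
    }

    public static void main(String[] args) {
        simulate().forEach(System.out::println);
    }
}
```

In this toy model the datanode always recovers. The report above concerns cases where the real datanode was instead told to shut down.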