[ https://issues.apache.org/jira/browse/HDFS-8676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14955135#comment-14955135 ]
Hudson commented on HDFS-8676:
------------------------------
FAILURE: Integrated in Hadoop-Mapreduce-trunk-Java8 #520 (See
[https://builds.apache.org/job/Hadoop-Mapreduce-trunk-Java8/520/])
HDFS-8676. Delayed rolling upgrade finalization can cause heartbeat (kihwal:
rev 5b43db47a313decccdcca8f45c5708aab46396df)
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/BlockPoolSliceStorage.java
* hadoop-hdfs-project/hadoop-hdfs/CHANGES.txt
* hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/datanode/DataStorage.java
> Delayed rolling upgrade finalization can cause heartbeat expiration
> -------------------------------------------------------------------
>
> Key: HDFS-8676
> URL: https://issues.apache.org/jira/browse/HDFS-8676
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: Kihwal Lee
> Assignee: Walter Su
> Priority: Critical
> Attachments: HDFS-8676.01.patch, HDFS-8676.02.patch
>
>
> In big busy clusters where the deletion rate is also high, a lot of blocks
> can pile up in the datanode trash directories until an upgrade is finalized.
> When it is finally finalized, the deletion of trash is done in the service
> actor thread's context synchronously. This blocks the heartbeat and can
> cause heartbeat expiration.
> We have seen a namenode losing hundreds of nodes after a delayed upgrade
> finalization. The deletion of trash directories should be made asynchronous.
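
For illustration only, the idea of moving trash deletion off the heartbeat path might be sketched roughly as below. This is not the committed patch: the class name AsyncTrashCleaner and its methods are hypothetical, and the sketch assumes a single background worker is enough to drain the trash while the calling thread returns immediately.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.stream.Stream;

/** Hypothetical helper (an assumption for illustration, not HDFS code). */
public class AsyncTrashCleaner implements AutoCloseable {
    // One daemon worker thread, so pending deletions never keep the JVM alive.
    private final ExecutorService worker = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "trash-cleaner");
        t.setDaemon(true);
        return t;
    });

    /**
     * Queues the deletion and returns at once, so a caller such as a
     * heartbeat loop is not blocked while the files are removed.
     */
    public Future<?> deleteAsync(Path trashDir) {
        return worker.submit(() -> deleteRecursively(trashDir));
    }

    /** Deletes root and everything under it; does nothing if root is absent. */
    static void deleteRecursively(Path root) {
        if (!Files.exists(root)) {
            return;
        }
        try (Stream<Path> paths = Files.walk(root)) {
            // Reverse order visits children before their parent directories.
            paths.sorted(Comparator.reverseOrder()).forEach(p -> {
                try {
                    Files.delete(p);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Stops accepting new work; already queued deletions still run. */
    @Override
    public void close() {
        worker.shutdown();
    }
}
```

A caller would invoke deleteAsync(trashDir) at finalization time and go straight back to sending heartbeats; the returned Future is only needed if something wants to wait for, or check the outcome of, the cleanup.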