[ https://issues.apache.org/jira/browse/HDFS-8676?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14940045#comment-14940045 ]
Walter Su commented on HDFS-8676:
---------------------------------
Thanks Lee for reviewing. I'm on holiday. I'll update it soon.
> Delayed rolling upgrade finalization can cause heartbeat expiration
> -------------------------------------------------------------------
>
> Key: HDFS-8676
> URL: https://issues.apache.org/jira/browse/HDFS-8676
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: Kihwal Lee
> Assignee: Walter Su
> Priority: Critical
> Attachments: HDFS-8676.01.patch
>
>
> In big busy clusters where the deletion rate is also high, a lot of blocks
> can pile up in the datanode trash directories until an upgrade is finalized.
> When it is finally finalized, the deletion of trash is done in the service
> actor thread's context synchronously. This blocks the heartbeat and can
> cause heartbeat expiration.
> We have seen a namenode losing hundreds of nodes after a delayed upgrade
> finalization. The deletion of trash directories should be made asynchronous.
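
A minimal sketch of the asynchronous deletion the description asks for, assuming a dedicated worker thread is acceptable. The class AsyncTrashCleaner and its methods are hypothetical names used for illustration only; this is not the attached patch and not existing HDFS code. The idea is that the thread handling finalization only queues the delete and returns, so it can keep sending heartbeats while the trash is removed in the background.

```java
import java.io.IOException;
import java.nio.file.FileVisitResult;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.SimpleFileVisitor;
import java.nio.file.attribute.BasicFileAttributes;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

/** Hypothetical helper for illustration; not part of the HDFS code base. */
class AsyncTrashCleaner {
    // A single daemon worker, so trash deletion never runs on the caller's thread.
    private final ExecutorService worker = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "trash-cleaner");
        t.setDaemon(true);
        return t;
    });

    /** Queues a recursive delete of trashDir and returns immediately. */
    Future<?> deleteAsync(Path trashDir) {
        return worker.submit(() -> {
            deleteRecursively(trashDir);
            return null; // Callable form, so IOException surfaces through the Future
        });
    }

    /** Deletes files first, then each directory once it is empty. */
    static void deleteRecursively(Path root) throws IOException {
        if (!Files.exists(root)) {
            return;
        }
        Files.walkFileTree(root, new SimpleFileVisitor<Path>() {
            @Override
            public FileVisitResult visitFile(Path file, BasicFileAttributes attrs)
                    throws IOException {
                Files.delete(file);
                return FileVisitResult.CONTINUE;
            }

            @Override
            public FileVisitResult postVisitDirectory(Path dir, IOException exc)
                    throws IOException {
                if (exc != null) {
                    throw exc;
                }
                Files.delete(dir);
                return FileVisitResult.CONTINUE;
            }
        });
    }

    /** Stops accepting new work; already queued deletes still run. */
    void shutdown() {
        worker.shutdown();
    }
}
```

In this sketch the caller would invoke deleteAsync(trashDir) when finalization is requested and go straight back to its heartbeat loop. The returned Future is only needed if something wants to wait for, or log failures from, the cleanup.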
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)