[ https://issues.apache.org/jira/browse/HDFS-3087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13230357#comment-13230357 ]
Suresh Srinivas commented on HDFS-3087:
---------------------------------------
Kihwal, this is a good find. We should fix this.
This problem is not that serious. Prior to 0.23, we shut down the datanode
once decommissioning completed. After HDFS-1547 we no longer shut down the DN;
it continues to be shown as decommissioned. The expectation is that an admin
can later shut down the decommissioned DNs and proceed with maintenance of the
nodes. Given this, the question is: after we mark a DN as decommissioned, what
happens when its block report comes in? I suspect we move it back to
decommission in progress.
How about using the flag that DatanodeDescriptor has for tracking the first
block report? We should not mark a DN as decommissioned if its block report
has not been received. I also agree that we should not mark anything as
decommissioned until we come out of safemode.
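
The check suggested above could look roughly like the sketch below. It is
illustrative only and is not the actual NameNode code: the class, field, and
method names (DecommissionGuardSketch, Node, firstBlockReportReceived,
canFinishDecommission, and so on) are hypothetical stand-ins, and the
under-replicated block count is a simplified placeholder for whatever
replication check the NameNode really performs.

```java
// Hypothetical sketch: none of these names are the real HDFS API.
public class DecommissionGuardSketch {

    enum AdminState { NORMAL, DECOMMISSION_INPROGRESS, DECOMMISSIONED }

    /** Simplified stand-in for the per-datanode state kept by the name node. */
    static class Node {
        AdminState state = AdminState.NORMAL;
        // Assumed flag: set once the first block report has been processed.
        boolean firstBlockReportReceived = false;
        // Assumed counter: blocks on this node that still need replicas elsewhere.
        int underReplicatedBlocks = 0;

        void startDecommission() {
            state = AdminState.DECOMMISSION_INPROGRESS;
        }

        void processBlockReport(int underReplicated) {
            firstBlockReportReceived = true;
            underReplicatedBlocks = underReplicated;
        }
    }

    /** A node may finish decommissioning only when all three conditions hold. */
    static boolean canFinishDecommission(Node node, boolean inSafeMode) {
        if (inSafeMode) {
            return false; // never mark anything decommissioned in safemode
        }
        if (!node.firstBlockReportReceived) {
            return false; // "zero blocks" only means "no report yet"
        }
        return node.underReplicatedBlocks == 0;
    }

    /** Periodic check: promote the node only if the guard allows it. */
    static void checkDecommissionState(Node node, boolean inSafeMode) {
        if (node.state == AdminState.DECOMMISSION_INPROGRESS
                && canFinishDecommission(node, inSafeMode)) {
            node.state = AdminState.DECOMMISSIONED;
        }
    }

    public static void main(String[] args) {
        Node dn = new Node();
        dn.startDecommission();

        // Right after registration: no block report yet, so the node stays in progress.
        checkDecommissionState(dn, false);
        System.out.println("after registration: " + dn.state);

        // Block report arrives with blocks that still need replication.
        dn.processBlockReport(3);
        checkDecommissionState(dn, false);
        System.out.println("after block report: " + dn.state);

        // Replication has caught up.
        dn.underReplicatedBlocks = 0;
        checkDecommissionState(dn, false);
        System.out.println("after replication:  " + dn.state);
    }
}
```

Without the firstBlockReportReceived test, the first call in main would see
zero outstanding blocks and promote the node immediately, which is the
behaviour described in this issue.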
> Decomissioning on NN restart can complete without blocks being replicated
> -------------------------------------------------------------------------
>
> Key: HDFS-3087
> URL: https://issues.apache.org/jira/browse/HDFS-3087
> Project: Hadoop HDFS
> Issue Type: Bug
> Components: name-node
> Affects Versions: 0.23.0, 0.24.0
> Reporter: Kihwal Lee
> Assignee: Kihwal Lee
> Priority: Critical
> Fix For: 0.23.0, 0.24.0, 0.23.2, 0.23.3
>
>
> If a data node is added to the exclude list and the name node is restarted,
> the decommissioning happens right away on data node registration. At this
> point the initial block report has not been sent, so the name node thinks the
> node has zero blocks and the decommissioning completes very quickly, without
> replicating the blocks on that node.