[ 
https://issues.apache.org/jira/browse/HDFS-3087?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13229340#comment-13229340
 ] 

Kihwal Lee commented on HDFS-3087:
----------------------------------

I've talked to some of the admins, and they say they never do this. Even when 
the NN crashes in the middle of decommissioning nodes, they remove the nodes 
from the exclude list and restart the NN. So it may not be much of an issue for 
experienced people. 

Whether it is critical or not, it's still a bug.
                
> Decomissioning on NN restart can complete without blocks being replicated
> -------------------------------------------------------------------------
>
>                 Key: HDFS-3087
>                 URL: https://issues.apache.org/jira/browse/HDFS-3087
>             Project: Hadoop HDFS
>          Issue Type: Bug
>          Components: name-node
>    Affects Versions: 0.23.0, 0.24.0
>            Reporter: Kihwal Lee
>            Assignee: Kihwal Lee
>            Priority: Critical
>             Fix For: 0.23.0, 0.24.0, 0.23.2, 0.23.3
>
>
> If a data node is added to the exclude list and the name node is restarted, 
> decommissioning starts right away when the data node registers. At this 
> point the initial block report has not been sent, so the name node thinks the 
> node has zero blocks and decommissioning completes very quickly, without 
> replicating the blocks on that node.
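
The failure mode described in the issue can be illustrated with a small, self-contained model. This is only a sketch under stated assumptions: DataNodeView, receiveBlockReport, markReplicated and both isDecommissionComplete* methods are hypothetical names invented for illustration, not the actual HDFS classes, and the guarded check is one plausible way to avoid the race, not necessarily what the committed patch does.

```java
import java.util.HashSet;
import java.util.Set;

public class DecommissionSketch {

    /** Hypothetical stand-in for the name node's view of one data node. */
    static final class DataNodeView {
        // Blocks on this node that still lack enough replicas elsewhere.
        private final Set<Long> pendingBlocks = new HashSet<>();
        private boolean blockReportReceived = false;

        /** Models the initial block report arriving after registration. */
        void receiveBlockReport(long... blockIds) {
            pendingBlocks.clear();
            for (long id : blockIds) {
                pendingBlocks.add(id);
            }
            blockReportReceived = true;
        }

        /** Models one block reaching sufficient replication on other nodes. */
        void markReplicated(long blockId) {
            pendingBlocks.remove(blockId);
        }

        /** Naive check: "no known blocks" is read as "nothing left to replicate". */
        boolean isDecommissionCompleteNaive() {
            return pendingBlocks.isEmpty();
        }

        /** Guarded check (assumption): never finish before the first block report. */
        boolean isDecommissionCompleteGuarded() {
            return blockReportReceived && pendingBlocks.isEmpty();
        }
    }

    public static void main(String[] args) {
        DataNodeView node = new DataNodeView();

        // The node has just registered after a restart; no block report yet.
        System.out.println("before report, naive:   " + node.isDecommissionCompleteNaive());
        System.out.println("before report, guarded: " + node.isDecommissionCompleteGuarded());

        // The block report arrives and reveals two blocks that need replication.
        node.receiveBlockReport(1L, 2L);
        System.out.println("after report, guarded:  " + node.isDecommissionCompleteGuarded());

        // Both blocks get replicated elsewhere; only now is it safe to finish.
        node.markReplicated(1L);
        node.markReplicated(2L);
        System.out.println("after replication:      " + node.isDecommissionCompleteGuarded());
    }
}
```

In this model the naive check reports completion immediately after registration, which matches the symptom in the description, while the guarded check waits until the block report has been processed and the pending set has drained.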
