premature end-of-decommission of datanodes
------------------------------------------
Key: HADOOP-3423
URL: https://issues.apache.org/jira/browse/HADOOP-3423
Project: Hadoop Core
Issue Type: Bug
Components: dfs
Reporter: dhruba borthakur
Decommissioning requires that the nodes be listed in the exclude file named by the dfs.hosts.exclude property. The administrator then runs the "dfsadmin -refreshNodes" command and the decommissioning process starts. Suppose that one of the datanodes being decommissioned has to re-register with the namenode. This can occur if the namenode restarts, or if the datanode restarts, while decommissioning is in progress. Now the namenode refuses to talk to this datanode because it is in the exclude list! This is a premature end of the decommissioning process.
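For context, the setup that triggers this looks roughly like the following (the file path is an example; the property name is dfs.hosts.exclude in hdfs-site.xml):

```xml
<!-- hdfs-site.xml: point the namenode at an exclude file -->
<property>
  <name>dfs.hosts.exclude</name>
  <value>/etc/hadoop/conf/dfs.exclude</value>
</property>
```

The administrator adds the datanode's hostname to that file and runs "bin/hadoop dfsadmin -refreshNodes" to kick off decommissioning.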
One way to fix this bug is to make the namenode always accept registration requests, even from datanodes that are in the exclude list. The namenode should, however, set the "being decommissioned" flag for these datanodes and restart the decommissioning process for them. When decommissioning is complete, the namenode will ask the datanodes to shut down.
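A minimal sketch of the proposed registration behavior. The class and method names here are illustrative, not Hadoop's actual API: the point is that a node on the exclude list is accepted and moved into a decommission-in-progress state instead of being rejected.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical model of the proposed fix: the namenode accepts
// registrations from excluded datanodes and (re)starts decommissioning
// for them, rather than refusing to talk to them.
class NamenodeRegistry {
    enum State { NORMAL, DECOMMISSION_IN_PROGRESS }

    private final Set<String> excludeList = new HashSet<>();
    private final Map<String, State> datanodes = new HashMap<>();

    void addToExcludeList(String host) {
        excludeList.add(host);
    }

    // Called when a datanode registers, either on first start or on
    // re-registration after a namenode/datanode restart.
    State register(String host) {
        if (excludeList.contains(host)) {
            // Proposed behavior: accept the registration but set the
            // "being decommissioned" flag and restart decommissioning.
            datanodes.put(host, State.DECOMMISSION_IN_PROGRESS);
        } else {
            datanodes.put(host, State.NORMAL);
        }
        return datanodes.get(host);
    }

    State stateOf(String host) {
        return datanodes.get(host);
    }
}
```

Under this scheme a datanode that restarts mid-decommission simply re-enters the decommission-in-progress state on re-registration, so the process resumes instead of ending prematurely.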