[ 
https://issues.apache.org/jira/browse/HDFS-8056?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Ming Ma updated HDFS-8056:
--------------------------
    Assignee: Ming Ma
      Status: Patch Available  (was: Open)

> Decommissioned dead nodes should continue to be counted as dead after NN 
> restart
> --------------------------------------------------------------------------------
>
>                 Key: HDFS-8056
>                 URL: https://issues.apache.org/jira/browse/HDFS-8056
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>            Reporter: Ming Ma
>            Assignee: Ming Ma
>         Attachments: HDFS-8056.patch
>
>
> We had some offline discussion with [~andrew.wang] and [~cmccabe] about this. 
> Bring this up for more input and get the patch in place.
> Dead nodes are tracked by {{DatanodeManager}}'s {{datanodeMap}}. However, 
> after NN restarts, those nodes that were dead before NN restart won't be in 
> {{datanodeMap}}. {{DatanodeManager}}'s {{getDatanodeListForReport}} will add 
> those dead nodes, but not if they are in the exclude file.
> {noformat}
>     if (listDeadNodes) {
>       for (InetSocketAddress addr : includedNodes) {
>         if (foundNodes.matchedBy(addr) || excludedNodes.match(addr)) {
>           continue;
>         }
>         // The remaining nodes are ones that are referenced by the hosts
>         // files but that we do not know about, ie that we have never
>         // head from. Eg. an entry that is no longer part of the cluster
>         // or a bogus entry was given in the hosts files
>         //
>         // If the host file entry specified the xferPort, we use that.
>         // Otherwise, we guess that it is the default xfer port.
>         // We can't ask the DataNode what it had configured, because it's
>         // dead.
>         DatanodeDescriptor dn = new DatanodeDescriptor(new DatanodeID(addr
>                 .getAddress().getHostAddress(), addr.getHostName(), "",
>                 addr.getPort() == 0 ? defaultXferPort : addr.getPort(),
>                 defaultInfoPort, defaultInfoSecurePort, defaultIpcPort));
>         setDatanodeDead(dn);
>         nodes.add(dn);
>       }
>     }
> {noformat}
> The issue here is the decommissioned dead node JMX will be different after NN 
> restart. It might be better to make it consistent across NN restart. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to