[ https://issues.apache.org/jira/browse/HDFS-6397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13998926#comment-13998926 ]
Kihwal Lee commented on HDFS-6397:
----------------------------------
I've noticed that the dead node count does not include nodes that are listed in
dfs.include but have never contacted the NN. Nodes that contacted the NN and
later died do count toward the dead node count. So live_node_count +
dead_node_count can be less than the total node count from dfs.include.
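
To illustrate the accounting described above, here is a minimal sketch. It is
not the NameNode's code: the function name count_nodes, its parameters, and the
host names are hypothetical, and it only models the behavior as reported in
this comment.

```python
def count_nodes(include_hosts, registered_hosts, live_hosts):
    """Model the reported counting: only nodes that registered can be 'dead'."""
    include = set(include_hosts)
    # Nodes from the include list that contacted the NN at least once.
    registered = set(registered_hosts) & include
    live = set(live_hosts) & registered
    # Registered earlier, but no longer alive.
    dead = registered - live
    # Listed in the include file, never seen by the NN: counted nowhere.
    never_contacted = include - registered
    return {
        "live": len(live),
        "dead": len(dead),
        "uncounted": len(never_contacted),
        "total": len(include),
    }


counts = count_nodes(
    include_hosts=["dn1", "dn2", "dn3", "dn4"],
    registered_hosts=["dn1", "dn2", "dn3"],
    live_hosts=["dn1", "dn2"],
)
# live + dead is 3 here, while the include list has 4 entries.
print(counts)
```

In this model, dn4 is in the include list but falls into neither the live nor
the dead count, which is the gap described above.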
> NN shows inconsistent value in deadnode count
> ----------------------------------------------
>
> Key: HDFS-6397
> URL: https://issues.apache.org/jira/browse/HDFS-6397
> Project: Hadoop HDFS
> Issue Type: Bug
> Affects Versions: 2.4.1
> Reporter: Mohammad Kamrul Islam
> Assignee: Mohammad Kamrul Islam
> Attachments: HDFS-6397.1.patch, HDFS-6397.2.patch
>
>
> Context:
> When NN is started without any live datanode, but there are nodes in the
> dfs.includes, NN shows the deadcount as '0'.
> There are two inconsistencies:
> 1. If you click on deadnode links (which shows the count is 0), it will
> display the list of deadnodes correctly.
> 2. hadoop 1.x used to display the count correctly.
> The following snippets of JMX response will explain it further:
> Look at the value of "NumDeadDataNodes"
> {noformat}
> {
> "name" : "Hadoop:service=NameNode,name=FSNamesystemState",
> "modelerType" : "org.apache.hadoop.hdfs.server.namenode.FSNamesystem",
> "CapacityTotal" : 0,
> "CapacityUsed" : 0,
> ...
> "NumLiveDataNodes" : 0,
> "NumDeadDataNodes" : 0,
> "NumDecomLiveDataNodes" : 0,
> "NumDecomDeadDataNodes" : 0,
> "NumDecommissioningDataNodes" : 0,
> "NumStaleDataNodes" : 0
> },
> {noformat}
> Look at "DeadNodes".
> {noformat}
> {
> "name" : "Hadoop:service=NameNode,name=NameNodeInfo",
> "modelerType" : "org.apache.hadoop.hdfs.server.namenode.FSNamesystem",
>
> ....
> "TotalBlocks" : 70,
> "TotalFiles" : 129,
> "NumberOfMissingBlocks" : 0,
> "LiveNodes" : "{}",
> "DeadNodes" :
> "{\"<MMMMM>.linkedin.com\":{\"lastContact\":1400037397,\"decommissioned\":false,\"xferaddr\":\"172.XX.X.XX:71\"},\"<NNNNN>.linkedin.com\":{\"lastContact\":1400037397,\"decommissioned\":false,\"xferaddr\":\"172.XX.XX.XX:71\"}}",
> "DecomNodes" : "{}",
> .....
> }
> {noformat}
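
The mismatch between the two JMX snippets above can be checked mechanically.
The following is a rough sketch, assuming the beans have already been loaded
into a list of dicts; the helper name dead_node_counts and the example.com
host names are hypothetical placeholders, not values from the report. Note
that "DeadNodes" is itself a JSON-encoded string and has to be decoded a
second time.

```python
import json


def dead_node_counts(beans):
    """Return (reported count, number of listed dead nodes) from JMX beans."""
    reported = None
    listed = None
    for bean in beans:
        name = bean.get("name", "")
        if name.endswith("name=FSNamesystemState"):
            reported = bean.get("NumDeadDataNodes")
        elif name.endswith("name=NameNodeInfo"):
            # "DeadNodes" holds a JSON document encoded as a string.
            listed = len(json.loads(bean.get("DeadNodes", "{}")))
    return reported, listed


# Placeholder data shaped like the two snippets in the description.
sample_beans = [
    {
        "name": "Hadoop:service=NameNode,name=FSNamesystemState",
        "NumDeadDataNodes": 0,
    },
    {
        "name": "Hadoop:service=NameNode,name=NameNodeInfo",
        "DeadNodes": json.dumps({
            "host-a.example.com": {"lastContact": 1400037397, "decommissioned": False},
            "host-b.example.com": {"lastContact": 1400037397, "decommissioned": False},
        }),
    },
]

reported, listed = dead_node_counts(sample_beans)
if reported != listed:
    print("inconsistent: NumDeadDataNodes=%s, DeadNodes lists %s" % (reported, listed))
```

With data shaped like the report, the counter is 0 while the DeadNodes map
lists two hosts, which is the inconsistency this issue describes.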
--
This message was sent by Atlassian JIRA
(v6.2#6252)