Phil Yang created HDFS-9500:
-------------------------------
Summary: datanodesSoftwareVersions map may counting wrong when
rolling upgrade
Key: HDFS-9500
URL: https://issues.apache.org/jira/browse/HDFS-9500
Project: Hadoop HDFS
Issue Type: Bug
Affects Versions: 2.6.2, 2.7.1
Reporter: Phil Yang
Assignee: Phil Yang
During a rolling upgrade, the overview page of the NameNode web UI reports the
DataNodes in the cluster by version, for example x nodes on 2.6.0 and y nodes
on 2.6.2. However, sometimes when I stop a DataNode running the old version and
start it again on the new version, the NameNode increments the count for the
new version but does not decrement the count for the old version, so the total
x+y becomes larger than the actual number of DataNodes. Even after all
DataNodes have been upgraded, the page still reports several DataNodes on the
old version, and I have to run hdfs dfsadmin -refreshNodes to clear the
message.
I think this issue is caused by DatanodeManager.registerDatanode. If nodeS,
still recorded with the old version, is not alive because it was shut down, it
does not pass shouldCountVersion, so the count for the old version is not
decremented. But this check only looks at the heartbeat and isAlive status at
that moment: if the NameNode has not yet noticed that the node is down and
removed it, and the node then restarts on the new version, the
decrementVersionCount that belongs to this node is never executed.
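
The sequence above can be reproduced with a small stand-alone model. This is
only a sketch under simplifying assumptions: the class VersionCountSketch, the
Node type and the looksAlive flag are hypothetical, and the method bodies are
not copied from DatanodeManager; only the method names echo the ones mentioned
in this report.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified, hypothetical model of the counting logic described above.
// It is NOT the real DatanodeManager; only the method names echo the report.
class VersionCountSketch {
    static class Node {
        String version;      // software version last registered
        boolean looksAlive;  // stand-in for the heartbeat/isAlive check
    }

    final Map<String, Integer> counts = new HashMap<>();

    boolean shouldCountVersion(Node n) {
        return n.version != null && n.looksAlive;
    }

    void incrementVersionCount(Node n) {
        if (shouldCountVersion(n)) {
            counts.merge(n.version, 1, Integer::sum);
        }
    }

    void decrementVersionCount(Node n) {
        if (!shouldCountVersion(n)) {
            return; // a node that looks dead is skipped, leaving its old entry behind
        }
        // Drop the entry entirely once its count reaches zero.
        counts.computeIfPresent(n.version, (v, c) -> c > 1 ? c - 1 : null);
    }

    // Re-registration path: forget the old version, then count the new one.
    void registerDatanode(Node n, String newVersion) {
        decrementVersionCount(n);
        n.version = newVersion;
        n.looksAlive = true;
        incrementVersionCount(n);
    }
}
```

In this model, registering a node with 2.6.0, setting looksAlive to false (the
node was stopped but not yet removed), and registering it again with 2.6.2
leaves one entry for each version, although only one node exists.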
So the simplest fix is to always recount the version map in registerDatanode,
since recounting is not a heavy operation.
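
A minimal sketch of what such a recount could look like, assuming a simplified
node model. The names RecountSketch, Node and recountVersions are hypothetical
and are not taken from the HDFS code base; the point is only that rebuilding
the map from the current node list cannot keep a stale entry.

```java
import java.util.Collection;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration of "recount instead of increment/decrement".
class RecountSketch {
    static class Node {
        final String version;      // software version currently registered
        final boolean looksAlive;  // stand-in for the heartbeat/isAlive check

        Node(String version, boolean looksAlive) {
            this.version = version;
            this.looksAlive = looksAlive;
        }
    }

    // Rebuild the version map from scratch: one pass over all known nodes.
    static Map<String, Integer> recountVersions(Collection<Node> nodes) {
        Map<String, Integer> counts = new HashMap<>();
        for (Node n : nodes) {
            if (n.version != null && n.looksAlive) {
                counts.merge(n.version, 1, Integer::sum);
            }
        }
        return counts;
    }
}
```

Because the map is derived only from the nodes' current versions, a node that
re-registered on the new version contributes to exactly one entry, whatever
its liveness state was just before the re-registration.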
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)