[jira] [Commented] (HDFS-9500) datanodesSoftwareVersions map may counting wrong when rolling upgrade
[ https://issues.apache.org/jira/browse/HDFS-9500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15613622#comment-15613622 ]

Hudson commented on HDFS-9500:
------------------------------

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #10712 (See [https://builds.apache.org/job/Hadoop-trunk-Commit/10712/])
HDFS-9500. Fix software version counts for DataNodes during rolling (shv: rev f3ac1f41b8fa82a0ac87a207d7afa2061d90a9bd)
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java
* (edit) hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestDatanodeManager.java

> datanodesSoftwareVersions map may counting wrong when rolling upgrade
> ---------------------------------------------------------------------
>
>                 Key: HDFS-9500
>                 URL: https://issues.apache.org/jira/browse/HDFS-9500
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 2.7.1, 2.6.2
>            Reporter: Phil Yang
>            Assignee: Erik Krogen
>             Fix For: 2.9.0, 2.7.4, 3.0.0-alpha2
>
>         Attachments: 9500-v1.patch, HDFS-9500.000.patch, HDFS-9500.001.patch, HDFS-9500.002.patch
>
>
> During a rolling upgrade, the namenode's web overview reports that the cluster contains datanodes of two versions, for example x nodes on 2.6.0 and y nodes on 2.6.2. However, sometimes when I stop a datanode running the old version and start it with the new version, the namenode increments the count for the new version but does not decrement the count for the old version. The total x+y then becomes larger than the number of datanodes. Even after all datanodes are upgraded, the overview still reports several datanodes on the old version, and I must run hdfs dfsadmin -refreshNodes to clear the message.
> I think this issue is caused by DatanodeManager.registerDatanode. If nodeS on the old version is not alive because it was shut down, it does not pass shouldCountVersion, so the count for the old version is not decremented. But this method only checks the heartbeat status and isAlive at that moment. If the namenode has not yet removed this node (removal is what decrements the version map) and the node restarts with the new version, the decrementVersionCount that belongs to this node is never executed.
> So the simplest fix is to always recount the version map in registerDatanode, since it is not a heavy operation.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
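The miscount described in the report can be reproduced with a small model. The sketch below is not the real DatanodeManager: the Node class, the EXPIRY_MS value, and the recountVersions() helper are hypothetical stand-ins, and only the names registerDatanode and shouldCountVersion are borrowed from the report. It assumes the check combines the alive flag with heartbeat freshness, as the report describes.

```java
import java.util.HashMap;
import java.util.Map;

/** Simplified model of per-version DataNode counting; not the real DatanodeManager. */
class VersionCountSketch {
  /** Hypothetical stand-in for a DataNode record. */
  static final class Node {
    String version;
    boolean isAlive = true;
    long lastHeartbeatMs;

    Node(String version, long nowMs) {
      this.version = version;
      this.lastHeartbeatMs = nowMs;
    }
  }

  static final long EXPIRY_MS = 1_000; // assumed heartbeat expiry, for illustration only

  final Map<String, Node> nodes = new HashMap<>();
  final Map<String, Integer> versionCounts = new HashMap<>();

  /** The check as the report describes it: alive flag plus a fresh heartbeat. */
  boolean shouldCountVersion(Node n, long nowMs) {
    return n.isAlive && nowMs - n.lastHeartbeatMs <= EXPIRY_MS;
  }

  /** Incremental bookkeeping: the old version is decremented only if the check passes. */
  void registerDatanode(String id, String version, long nowMs) {
    Node old = nodes.get(id);
    if (old != null && shouldCountVersion(old, nowMs)) {
      versionCounts.computeIfPresent(old.version, (k, c) -> c > 1 ? c - 1 : null);
    }
    nodes.put(id, new Node(version, nowMs));
    versionCounts.merge(version, 1, Integer::sum);
  }

  /** Full recount over all nodes currently flagged alive (the reporter's proposed remedy). */
  void recountVersions() {
    versionCounts.clear();
    for (Node n : nodes.values()) {
      if (n.isAlive) {
        versionCounts.merge(n.version, 1, Integer::sum);
      }
    }
  }

  public static void main(String[] args) {
    VersionCountSketch nn = new VersionCountSketch();
    nn.registerDatanode("dn1", "2.6.0", 0);
    // dn1 stops and its heartbeat expires, but it has not been removed yet.
    nn.registerDatanode("dn1", "2.6.2", 5_000);
    System.out.println("incremental: " + nn.versionCounts);
    nn.recountVersions();
    System.out.println("recounted:   " + nn.versionCounts);
  }
}
```

In this model a single node ends up counted under both versions after the second registration, and the recount restores a total that matches the number of nodes.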
[jira] [Commented] (HDFS-9500) datanodesSoftwareVersions map may counting wrong when rolling upgrade
[ https://issues.apache.org/jira/browse/HDFS-9500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15613383#comment-15613383 ]

Konstantin Shvachko commented on HDFS-9500:
-------------------------------------------

Very strange. The build page on Jenkins says
Test Result (no failures)
Guess something went wrong with Jenkins reporting.

> datanodesSoftwareVersions map may counting wrong when rolling upgrade
> ---------------------------------------------------------------------
>
>                 Key: HDFS-9500
>                 URL: https://issues.apache.org/jira/browse/HDFS-9500
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 2.7.1, 2.6.2
>            Reporter: Phil Yang
>            Assignee: Erik Krogen
>         Attachments: 9500-v1.patch, HDFS-9500.000.patch, HDFS-9500.001.patch, HDFS-9500.002.patch
[jira] [Commented] (HDFS-9500) datanodesSoftwareVersions map may counting wrong when rolling upgrade
[ https://issues.apache.org/jira/browse/HDFS-9500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15609303#comment-15609303 ]

Hadoop QA commented on HDFS-9500:
---------------------------------

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 20s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
| +1 | mvninstall | 8m 40s | trunk passed |
| +1 | compile | 0m 58s | trunk passed |
| +1 | checkstyle | 0m 35s | trunk passed |
| +1 | mvnsite | 1m 8s | trunk passed |
| +1 | mvneclipse | 0m 14s | trunk passed |
| +1 | findbugs | 2m 8s | trunk passed |
| +1 | javadoc | 0m 49s | trunk passed |
| +1 | mvninstall | 1m 0s | the patch passed |
| +1 | compile | 0m 56s | the patch passed |
| +1 | javac | 0m 56s | the patch passed |
| +1 | checkstyle | 0m 30s | the patch passed |
| +1 | mvnsite | 1m 5s | the patch passed |
| +1 | mvneclipse | 0m 12s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | findbugs | 2m 9s | the patch passed |
| +1 | javadoc | 0m 42s | the patch passed |
| -1 | unit | 42m 26s | hadoop-hdfs in the patch failed. |
| +1 | asflicense | 0m 21s | The patch does not generate ASF License warnings. |
| | | 65m 37s | |

|| Reason || Tests ||
| Timed out junit tests | org.apache.hadoop.hdfs.TestLeaseRecovery2 |
| | org.apache.hadoop.hdfs.TestFileChecksum |
| | org.apache.hadoop.hdfs.TestParallelShortCircuitLegacyRead |
| | org.apache.hadoop.hdfs.server.namenode.TestStartup |
| | org.apache.hadoop.hdfs.TestWriteRead |
| | org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure |
| | org.apache.hadoop.fs.viewfs.TestViewFileSystemHdfs |
| | org.apache.hadoop.hdfs.server.blockmanagement.TestDatanodeManager |
| | org.apache.hadoop.hdfs.TestFileCreationClient |
| | org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting |
| | org.apache.hadoop.hdfs.server.datanode.TestNNHandlesBlockReportPerStorage |
| | org.apache.hadoop.hdfs.server.datanode.TestDataNodeMetrics |
| | org.apache.hadoop.hdfs.server.datanode.TestNNHandlesCombinedBlockReport |
| | org.apache.hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes |
| | org.apache.hadoop.hdfs.server.namenode.ha.TestStandbyCheckpoints |
| | org.apache.hadoop.hdfs.server.datanode.TestDataNodeRollingUpgrade |
| | org.apache.hadoop.hdfs.TestMaintenanceState |
| | org.apache.hadoop.hdfs.server.blockmanagement.TestReplicationPolicy |
| | org.apache.hadoop.hdfs.server.namenode.TestFileContextAcl |
| | org.apache.hadoop.fs.TestEnhancedByteBufferAccess |
| | org.apache.hadoop.hdfs.server.namenode.TestLargeDirectoryDelete |
| | org.apache.hadoop.hdfs.TestDFSClientFailover |
| | org.apache.hadoop.hdfs.TestSetrepIncreasing |
| | org.apache.hadoop.hdfs.TestClientProtocolForPipelineRecovery |
| | org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure180 |
[jira] [Commented] (HDFS-9500) datanodesSoftwareVersions map may counting wrong when rolling upgrade
[ https://issues.apache.org/jira/browse/HDFS-9500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15609243#comment-15609243 ]

Konstantin Shvachko commented on HDFS-9500:
-------------------------------------------

+1. Looks good.
Will commit in a bit.

> datanodesSoftwareVersions map may counting wrong when rolling upgrade
> ---------------------------------------------------------------------
>
>                 Key: HDFS-9500
>                 URL: https://issues.apache.org/jira/browse/HDFS-9500
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 2.7.1, 2.6.2
>            Reporter: Phil Yang
>            Assignee: Erik Krogen
>         Attachments: 9500-v1.patch, HDFS-9500.000.patch, HDFS-9500.001.patch, HDFS-9500.002.patch
[jira] [Commented] (HDFS-9500) datanodesSoftwareVersions map may counting wrong when rolling upgrade
[ https://issues.apache.org/jira/browse/HDFS-9500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15609175#comment-15609175 ]

Hadoop QA commented on HDFS-9500:
---------------------------------

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 21s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
| +1 | mvninstall | 8m 32s | trunk passed |
| +1 | compile | 1m 2s | trunk passed |
| +1 | checkstyle | 0m 31s | trunk passed |
| +1 | mvnsite | 1m 8s | trunk passed |
| +1 | mvneclipse | 0m 14s | trunk passed |
| +1 | findbugs | 2m 4s | trunk passed |
| +1 | javadoc | 0m 45s | trunk passed |
| +1 | mvninstall | 1m 1s | the patch passed |
| +1 | compile | 1m 0s | the patch passed |
| +1 | javac | 1m 0s | the patch passed |
| +1 | checkstyle | 0m 26s | the patch passed |
| +1 | mvnsite | 0m 51s | the patch passed |
| +1 | mvneclipse | 0m 10s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | findbugs | 1m 52s | the patch passed |
| +1 | javadoc | 0m 40s | the patch passed |
| -1 | unit | 61m 56s | hadoop-hdfs in the patch failed. |
| +1 | asflicense | 0m 17s | The patch does not generate ASF License warnings. |
| | | 84m 6s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.namenode.TestNameNodeMetadataConsistency |
| Timed out junit tests | org.apache.hadoop.hdfs.server.namenode.TestLargeDirectoryDelete |
| | org.apache.hadoop.hdfs.TestDFSUpgradeFromImage |
| | org.apache.hadoop.hdfs.TestInjectionForSimulatedStorage |
| | org.apache.hadoop.hdfs.server.namenode.TestNameNodeReconfigure |
| | org.apache.hadoop.hdfs.TestLeaseRecovery |
| | org.apache.hadoop.hdfs.server.namenode.TestFileTruncate |
| | org.apache.hadoop.hdfs.TestRestartDFS |
| | org.apache.hadoop.hdfs.server.namenode.TestDeleteRace |
| | org.apache.hadoop.hdfs.TestLeaseRecoveryStriped |
| | org.apache.hadoop.hdfs.server.namenode.TestAuditLogs |

|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:9560f25 |
| JIRA Issue | HDFS-9500 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12835378/HDFS-9500.001.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
| uname | Linux bbf763f21909 3.13.0-96-generic #143-Ubuntu SMP Mon Aug 29 20:15:20 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 1f8490a |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
| unit | https://builds.apache.org/job/PreCommit-HDFS-Build/17293/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/17293/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U:
[jira] [Commented] (HDFS-9500) datanodesSoftwareVersions map may counting wrong when rolling upgrade
[ https://issues.apache.org/jira/browse/HDFS-9500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15609059#comment-15609059 ]

Ravi Prakash commented on HDFS-9500:
------------------------------------

Thanks for your patch Erik and your review Konst!
bq. // Check isAlive too because right after removeDatanode(),
bq. // isDatanodeDead() is still true
This comment is not valid anymore after your changes. Could you please fix this too?

> datanodesSoftwareVersions map may counting wrong when rolling upgrade
> ---------------------------------------------------------------------
>
>                 Key: HDFS-9500
>                 URL: https://issues.apache.org/jira/browse/HDFS-9500
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 2.7.1, 2.6.2
>            Reporter: Phil Yang
>            Assignee: Erik Krogen
>         Attachments: 9500-v1.patch, HDFS-9500.000.patch, HDFS-9500.001.patch
[jira] [Commented] (HDFS-9500) datanodesSoftwareVersions map may counting wrong when rolling upgrade
[ https://issues.apache.org/jira/browse/HDFS-9500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15607096#comment-15607096 ]

Konstantin Shvachko commented on HDFS-9500:
-------------------------------------------

Good find Erik. I agree we should decrement version count whenever DN is alive. The heartbeat interval expiration doesn't matter here because the node will be marked alive by that same method. It should also work for the full version recount in {{countSoftwareVersions()}}.
Two nits
# Could you please add Javadoc
#* to {{shouldCountVersion()}} saying we count versions for all alive nodes
#* and to the new test explaining its purpose
# You do not need to use Mockito for setting fields in {{DatanodeRegistration}}. Can't you just use a constructor?

> datanodesSoftwareVersions map may counting wrong when rolling upgrade
> ---------------------------------------------------------------------
>
>                 Key: HDFS-9500
>                 URL: https://issues.apache.org/jira/browse/HDFS-9500
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 2.7.1, 2.6.2
>            Reporter: Phil Yang
>            Assignee: Erik Krogen
>         Attachments: 9500-v1.patch, HDFS-9500.000.patch
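The rule agreed on in the review comment above (count versions for every node flagged alive, regardless of heartbeat age) can be sketched as follows. This is a hypothetical model, not the committed patch: the Node class and the register() method are invented for illustration, and only the shouldCountVersion name and the wording of its Javadoc follow the comment.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical model of alive-only version counting; not the actual Hadoop change. */
class AliveOnlyCountSketch {
  /** Minimal stand-in for a DataNode record. Heartbeat age is deliberately absent. */
  static final class Node {
    final String version;
    boolean isAlive = true;

    Node(String version) {
      this.version = version;
    }
  }

  final Map<String, Node> nodes = new HashMap<>();
  final Map<String, Integer> versionCounts = new HashMap<>();

  /**
   * Versions are counted for all alive nodes. Heartbeat expiry is not consulted,
   * because a re-registering node is marked alive again by the same registration.
   */
  static boolean shouldCountVersion(Node n) {
    return n != null && n.isAlive;
  }

  /** Re-registration: drop the old version's count whenever the old record is alive. */
  void register(String id, String version) {
    Node old = nodes.get(id);
    if (shouldCountVersion(old)) {
      versionCounts.computeIfPresent(old.version, (k, c) -> c > 1 ? c - 1 : null);
    }
    nodes.put(id, new Node(version));
    versionCounts.merge(version, 1, Integer::sum);
  }

  public static void main(String[] args) {
    AliveOnlyCountSketch nn = new AliveOnlyCountSketch();
    nn.register("dn1", "2.6.0");
    nn.register("dn2", "2.6.0");
    nn.register("dn1", "2.6.2"); // dn1 upgraded
    nn.register("dn2", "2.6.2"); // dn2 upgraded
    System.out.println(nn.versionCounts);
  }
}
```

Because the predicate no longer depends on time, the sum of the counts stays equal to the number of registered nodes at every step of the upgrade in this model.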
[jira] [Commented] (HDFS-9500) datanodesSoftwareVersions map may counting wrong when rolling upgrade
[ https://issues.apache.org/jira/browse/HDFS-9500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15606385#comment-15606385 ]

Erik Krogen commented on HDFS-9500:
-----------------------------------

The {{TestDiskspaceQuotaUpdate}} failure is unrelated and documented in HDFS-10921. The other 5 tests all pass locally.

> datanodesSoftwareVersions map may counting wrong when rolling upgrade
> ---------------------------------------------------------------------
>
>                 Key: HDFS-9500
>                 URL: https://issues.apache.org/jira/browse/HDFS-9500
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 2.7.1, 2.6.2
>            Reporter: Phil Yang
>            Assignee: Erik Krogen
>         Attachments: 9500-v1.patch, HDFS-9500.000.patch
[jira] [Commented] (HDFS-9500) datanodesSoftwareVersions map may counting wrong when rolling upgrade
[ https://issues.apache.org/jira/browse/HDFS-9500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15606316#comment-15606316 ]

Hadoop QA commented on HDFS-9500:
---------------------------------

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 17s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
| +1 | mvninstall | 8m 32s | trunk passed |
| +1 | compile | 0m 57s | trunk passed |
| +1 | checkstyle | 0m 32s | trunk passed |
| +1 | mvnsite | 1m 3s | trunk passed |
| +1 | mvneclipse | 0m 14s | trunk passed |
| +1 | findbugs | 1m 52s | trunk passed |
| +1 | javadoc | 0m 41s | trunk passed |
| +1 | mvninstall | 0m 45s | the patch passed |
| +1 | compile | 0m 41s | the patch passed |
| +1 | javac | 0m 41s | the patch passed |
| +1 | checkstyle | 0m 23s | the patch passed |
| +1 | mvnsite | 0m 48s | the patch passed |
| +1 | mvneclipse | 0m 9s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | findbugs | 1m 48s | the patch passed |
| +1 | javadoc | 0m 36s | the patch passed |
| -1 | unit | 43m 7s | hadoop-hdfs in the patch failed. |
| +1 | asflicense | 0m 20s | The patch does not generate ASF License warnings. |
| | | 63m 57s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.namenode.TestDiskspaceQuotaUpdate |
| Timed out junit tests | org.apache.hadoop.hdfs.server.datanode.TestDataNodeRollingUpgrade |
| | org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyWriter |
| | org.apache.hadoop.tracing.TestTracing |
| | org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestSpaceReservation |
| | org.apache.hadoop.hdfs.server.datanode.TestFsDatasetCache |

|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:9560f25 |
| JIRA Issue | HDFS-9500 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12835171/HDFS-9500.000.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
| uname | Linux a1d58e835aba 3.13.0-92-generic #139-Ubuntu SMP Tue Jun 28 20:42:26 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 9a8a386 |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
| unit | https://builds.apache.org/job/PreCommit-HDFS-Build/17278/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/17278/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/17278/console |
| Powered by | Apache Yetus 0.4.0-SNAPSHOT http://yetus.apache.org |

This message was automatically generated.

> datanodesSoftwareVersions map may counting wrong
[jira] [Commented] (HDFS-9500) datanodesSoftwareVersions map may counting wrong when rolling upgrade
[ https://issues.apache.org/jira/browse/HDFS-9500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15603695#comment-15603695 ]

Erik Krogen commented on HDFS-9500:
-----------------------------------

Assigning this to myself since Phil does not seem to be actively working on it anymore.

I can (intermittently) reproduce this test failure on branch-2.7 if I increase the number of iterations on {{TestDatanodeManager.testNumVersionsReportedCorrect}} to 5000. I found that for the node whose version should have been decremented, {{shouldDecrementVersion()}} returned false because {{isDatanodeDead()}} was true (but {{isAlive}} was also true). It seems this situation could arise if the time since the last heartbeat from the node was above the threshold for considering it dead, but the {{HeartbeatManager}} had not yet marked it dead.

I am open to suggestions about this. Would just checking {{DatanodeDescriptor.isAlive}} be sufficient here instead of the check on both {{isAlive}} and {{isDatanodeDead()}}?

> datanodesSoftwareVersions map may counting wrong when rolling upgrade
> ---------------------------------------------------------------------
>
>                 Key: HDFS-9500
>                 URL: https://issues.apache.org/jira/browse/HDFS-9500
>             Project: Hadoop HDFS
>          Issue Type: Bug
>    Affects Versions: 2.7.1, 2.6.2
>            Reporter: Phil Yang
>            Assignee: Erik Krogen
>         Attachments: 9500-v1.patch
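The window described in the comment above, where {{isDatanodeDead()}} is already true while {{isAlive}} is still true, can be illustrated with a tiny timeline model. Everything here is a hypothetical stand-in: the EXPIRY_MS threshold, the monitorPass() method (playing the role of a periodic heartbeat check), and the state labels are assumptions for illustration and do not reflect Hadoop's actual classes.

```java
/** Hypothetical timeline model; names echo the thread but this is not Hadoop code. */
class LivenessWindowSketch {
  static final long EXPIRY_MS = 1_000; // assumed heartbeat expiry threshold

  long lastHeartbeatMs = 0;
  boolean isAlive = true; // flipped only when the monitor pass runs

  /** Purely time-based: true as soon as the last heartbeat is too old. */
  boolean isDatanodeDead(long nowMs) {
    return nowMs - lastHeartbeatMs > EXPIRY_MS;
  }

  /** Stand-in for the periodic check that marks expired nodes as not alive. */
  void monitorPass(long nowMs) {
    if (isDatanodeDead(nowMs)) {
      isAlive = false;
    }
  }

  /** Labels the node's state at a given instant. */
  String state(long nowMs) {
    if (!isAlive) {
      return "removed";
    }
    return isDatanodeDead(nowMs) ? "expired-but-not-yet-removed" : "live";
  }

  public static void main(String[] args) {
    LivenessWindowSketch dn = new LivenessWindowSketch();
    System.out.println(dn.state(500));   // heartbeat still fresh
    System.out.println(dn.state(1_500)); // expired, monitor has not run yet
    dn.monitorPass(1_600);
    System.out.println(dn.state(1_700)); // monitor has marked the node
  }
}
```

A re-registration that arrives in the middle state is the case where a check on both flags skips the decrement even though the node is still counted in the version map.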
[jira] [Commented] (HDFS-9500) datanodesSoftwareVersions map may counting wrong when rolling upgrade
[ https://issues.apache.org/jira/browse/HDFS-9500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15586504#comment-15586504 ] Konstantin Shvachko commented on HDFS-9500: --- Sounds like the condition for decrementing the old version is not accurate. I see this on trunk and other versions. [~yangzhe1991] do you still plan to work on it? > datanodesSoftwareVersions map may counting wrong when rolling upgrade > - > > Key: HDFS-9500 > URL: https://issues.apache.org/jira/browse/HDFS-9500 > Project: Hadoop HDFS > Issue Type: Bug >Affects Versions: 2.7.1, 2.6.2 >Reporter: Phil Yang >Assignee: Phil Yang > Attachments: 9500-v1.patch > > > While rolling upgrading, namenode's website overview will report there are > two versions datanodes in the cluster, for example, 2.6.0 has x nodes and > 2.6.2 has y nodes. However, sometimes when I stop a datanode in old version > and start a new version one, namenode only increases the number of new > version but not decreases the number of old version. So the total number x+y > will be larger than the number of datanodes. Even all datanodes are upgraded, > there will still have the messages that there are several datanode in old > version. And I must run hdfs dfsadmin -refreshNodes to clear this message. > I think this issue is caused by DatanodeManager.registerDatanode. If nodeS in > old version is not alive because of shutting down, it will not pass > shouldCountVersion, so the number of old version won't be decreased. But this > method only judges the status of heartbeat and isAlive at that moment, if > namenode has not removed this node which will decrease the version map and > this node restarts in the new version, the decrementVersionCount belongs to > this node will never be executed. > So the simplest way to fix this is that we always recounting the version map > in registerDatanode since it is not a heavy operation. 
-- This message was sent by Atlassian JIRA (v6.3.4#6332) - To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org
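The reporter's analysis in the issue description can be modelled with a small toy program. The sketch below is not the Hadoop code: the class `VersionCountSketch`, its `Node` type, and the methods `registerGuarded`, `markDead`, and `recount` are hypothetical names chosen for illustration. It only assumes what the description states, namely that the old version is decremented on re-registration only when the node is currently considered alive, and that a full recount repairs the map.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

/** Toy model of a per-version DataNode counter; NOT the Hadoop implementation. */
class VersionCountSketch {
  /** Hypothetical stand-in for a registered DataNode. */
  static final class Node {
    String version;
    boolean alive;
    Node(String version, boolean alive) {
      this.version = version;
      this.alive = alive;
    }
  }

  private final Map<String, Node> nodes = new HashMap<>();
  private final Map<String, Integer> versionCounts = new TreeMap<>();

  private void increment(String version) {
    versionCounts.merge(version, 1, Integer::sum);
  }

  private void decrement(String version) {
    // Drop the entry entirely once its count would reach zero.
    versionCounts.computeIfPresent(version, (v, n) -> n > 1 ? n - 1 : null);
  }

  /** Assumed buggy behaviour: the old version is decremented only if the node is alive. */
  void registerGuarded(String id, String newVersion) {
    Node old = nodes.get(id);
    if (old != null && old.alive) {
      decrement(old.version);
    }
    nodes.put(id, new Node(newVersion, true));
    increment(newVersion);
  }

  /** The node stopped heartbeating but has not been removed yet; counts are untouched. */
  void markDead(String id) {
    Node n = nodes.get(id);
    if (n != null) {
      n.alive = false;
    }
  }

  /** The proposal in the description: rebuild the map from the live nodes. */
  void recount() {
    versionCounts.clear();
    for (Node n : nodes.values()) {
      if (n.alive) {
        increment(n.version);
      }
    }
  }

  int count(String version) {
    return versionCounts.getOrDefault(version, 0);
  }

  int total() {
    int sum = 0;
    for (int c : versionCounts.values()) {
      sum += c;
    }
    return sum;
  }
}
```

In this model, registering `dn1` with 2.6.0, marking it dead, and registering it again with 2.6.2 leaves a total of 2 for a single node, which is the x+y overcount the report describes; calling `recount()` brings the total back to 1.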
[jira] [Commented] (HDFS-9500) datanodesSoftwareVersions map may counting wrong when rolling upgrade
[ https://issues.apache.org/jira/browse/HDFS-9500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15425757#comment-15425757 ]

Hadoop QA commented on HDFS-9500:
---------------------------------

| (x) -1 overall |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 0s | Docker mode activated. |
| -1 | patch | 0m 5s | HDFS-9500 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. |

|| Subsystem || Report/Notes ||
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12775497/9500-v1.patch |
| JIRA Issue | HDFS-9500 |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/16465/console |
| Powered by | Apache Yetus 0.4.0-SNAPSHOT http://yetus.apache.org |

This message was automatically generated.
[jira] [Commented] (HDFS-9500) datanodesSoftwareVersions map may counting wrong when rolling upgrade
[ https://issues.apache.org/jira/browse/HDFS-9500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15170009#comment-15170009 ]

Ravi Prakash commented on HDFS-9500:
------------------------------------

I see HDFS-9371 did away with the finer grained locking we had implemented in {{incrementVersionCount}} and {{decrementVersionCount}} (earlier we were synchronizing on datanodeMap, and now we synchronize on the entire DatanodeManager).

bq. I think this issue is caused by DatanodeManager.registerDatanode. If nodeS in old version is not alive because of shutting down, it will not pass shouldCountVersion, so the number of old version won't be decreased. But this method only judges the status of heartbeat and isAlive at that moment, if namenode has not removed this node which will decrease the version map and this node restarts in the new version, the decrementVersionCount belongs to this node will never be executed.

Thanks for the analysis [~yangzhe1991]! Could you please help me understand it?
Which version of Hadoop did you experience this on?
How do you update the version of the DNs? Do you let a long time pass between bringing down the DN in the old version and then bringing back a DN with the new version?
What state is the Datanode in when its old version is not decremented? Wouldn't https://github.com/apache/hadoop/blob/branch-2.6.2/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java#L520 decrement the version count?

[~kihwal] Are you seeing this too?
[jira] [Commented] (HDFS-9500) datanodesSoftwareVersions map may counting wrong when rolling upgrade
[ https://issues.apache.org/jira/browse/HDFS-9500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15169705#comment-15169705 ]

Hadoop QA commented on HDFS-9500:
---------------------------------

| (x) -1 overall |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 0s | Docker mode activated. |
| -1 | patch | 0m 5s | HDFS-9500 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. |

|| Subsystem || Report/Notes ||
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12775497/9500-v1.patch |
| JIRA Issue | HDFS-9500 |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/14633/console |
| Powered by | Apache Yetus 0.2.0-SNAPSHOT http://yetus.apache.org |

This message was automatically generated.
[jira] [Commented] (HDFS-9500) datanodesSoftwareVersions map may counting wrong when rolling upgrade
[ https://issues.apache.org/jira/browse/HDFS-9500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15169557#comment-15169557 ]

Ravi Prakash commented on HDFS-9500:
------------------------------------

Thanks for reporting the issue Phil and for bringing it to my attention Kihwal! Sorry for the late reply (my inbox ate mail) :(

We wanted to avoid counting the version map for every registration because, as Kihwal points out, it could be an expensive operation for big clusters (and you have to do it for every registration / dead node). I'd be in favor of fixing the problem without having to count all the versions every time, although if it's too onerous, I'm fine with Phil's proposal as well. I'll look through the code to see if there's an easy fix.
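One way to picture the preference stated in Ravi Prakash's comment (correct counts without a full recount on every registration) is to remember, per node, whether it currently contributes to the map, so that each registration can undo exactly that contribution in constant time. The following is a hypothetical sketch only: `CountedFlagSketch`, `Node.counted`, `register`, and `remove` are invented names, and this is not presented as the patch that was eventually committed for HDFS-9500.

```java
import java.util.HashMap;
import java.util.Map;

/** Hypothetical incremental counter; NOT the committed HDFS-9500 fix. */
class CountedFlagSketch {
  /** Hypothetical node record that remembers whether it is currently counted. */
  static final class Node {
    String version;
    boolean counted;
  }

  private final Map<String, Node> nodes = new HashMap<>();
  private final Map<String, Integer> counts = new HashMap<>();

  /** O(1) per registration: undo the node's previous contribution, then add the new one. */
  void register(String id, String version) {
    Node n = nodes.computeIfAbsent(id, k -> new Node());
    uncount(n); // independent of whether the node is considered alive right now
    n.version = version;
    addCount(n);
  }

  /** Called when the namenode finally removes a node. */
  void remove(String id) {
    Node n = nodes.remove(id);
    if (n != null) {
      uncount(n);
    }
  }

  private void addCount(Node n) {
    if (!n.counted) {
      counts.merge(n.version, 1, Integer::sum);
      n.counted = true;
    }
  }

  private void uncount(Node n) {
    if (n.counted) {
      // Drop the entry once its count would reach zero.
      counts.computeIfPresent(n.version, (v, c) -> c > 1 ? c - 1 : null);
      n.counted = false;
    }
  }

  int countOf(String version) {
    return counts.getOrDefault(version, 0);
  }
}
```

The design trade-off in this sketch is one extra boolean per node in exchange for avoiding a scan of all datanodes per registration, which is the cost the thread is concerned about for clusters with several thousand nodes.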
[jira] [Commented] (HDFS-9500) datanodesSoftwareVersions map may counting wrong when rolling upgrade
[ https://issues.apache.org/jira/browse/HDFS-9500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15120148#comment-15120148 ]

Junping Du commented on HDFS-9500:
----------------------------------

Move all non-critical pending issues out of 2.6.4 into 2.6.5.
[jira] [Commented] (HDFS-9500) datanodesSoftwareVersions map may counting wrong when rolling upgrade
[ https://issues.apache.org/jira/browse/HDFS-9500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15109452#comment-15109452 ]

Kihwal Lee commented on HDFS-9500:
----------------------------------

No doubt this op will be hot during start-up of a several thousand node cluster. What do you think [~raviprak]?
[jira] [Commented] (HDFS-9500) datanodesSoftwareVersions map may counting wrong when rolling upgrade
[ https://issues.apache.org/jira/browse/HDFS-9500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15103980#comment-15103980 ]

Junping Du commented on HDFS-9500:
----------------------------------

bq. So the simplest way to fix this is that we always recounting the version map in registerDatanode since it is not a heavy operation.

[~kihwal], do we have any concerns about this solution when the cluster is really huge?
[jira] [Commented] (HDFS-9500) datanodesSoftwareVersions map may counting wrong when rolling upgrade
[ https://issues.apache.org/jira/browse/HDFS-9500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15037555#comment-15037555 ]

Hadoop QA commented on HDFS-9500:
---------------------------------

| (x) -1 overall |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 0s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| -1 | test4tests | 0m 0s | The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. |
| +1 | mvninstall | 8m 22s | trunk passed |
| +1 | compile | 0m 44s | trunk passed with JDK v1.8.0_66 |
| +1 | compile | 0m 45s | trunk passed with JDK v1.7.0_85 |
| +1 | checkstyle | 0m 17s | trunk passed |
| +1 | mvnsite | 0m 57s | trunk passed |
| +1 | mvneclipse | 0m 14s | trunk passed |
| +1 | findbugs | 2m 5s | trunk passed |
| +1 | javadoc | 1m 9s | trunk passed with JDK v1.8.0_66 |
| +1 | javadoc | 1m 52s | trunk passed with JDK v1.7.0_85 |
| +1 | mvninstall | 0m 52s | the patch passed |
| +1 | compile | 0m 45s | the patch passed with JDK v1.8.0_66 |
| +1 | javac | 0m 45s | the patch passed |
| +1 | compile | 0m 45s | the patch passed with JDK v1.7.0_85 |
| +1 | javac | 0m 45s | the patch passed |
| +1 | checkstyle | 0m 17s | the patch passed |
| +1 | mvnsite | 0m 57s | the patch passed |
| +1 | mvneclipse | 0m 14s | the patch passed |
| +1 | whitespace | 0m 0s | Patch has no whitespace issues. |
| +1 | findbugs | 2m 12s | the patch passed |
| +1 | javadoc | 1m 12s | the patch passed with JDK v1.8.0_66 |
| +1 | javadoc | 1m 51s | the patch passed with JDK v1.7.0_85 |
| -1 | unit | 56m 26s | hadoop-hdfs in the patch failed with JDK v1.8.0_66. |
| +1 | unit | 53m 30s | hadoop-hdfs in the patch passed with JDK v1.7.0_85. |
| -1 | asflicense | 0m 23s | Patch generated 59 ASF License warnings. |
| | | 138m 44s | |

|| Reason || Tests ||
| JDK v1.8.0_66 Failed junit tests | hadoop.hdfs.TestDatanodeRegistration |
| | hadoop.hdfs.server.namenode.ha.TestSeveralNameNodes |
| | hadoop.hdfs.TestFileAppend |

|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:0ca8df7 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12775497/9500-v1.patch |
| JIRA Issue | HDFS-9500 |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
| uname | Linux c6f8a79943f4 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed
[jira] [Commented] (HDFS-9500) datanodesSoftwareVersions map may counting wrong when rolling upgrade
[ https://issues.apache.org/jira/browse/HDFS-9500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15037832#comment-15037832 ]

Junping Du commented on HDFS-9500:
----------------------------------

Moving non-blocker/non-critical issues out of 2.6.3 into 2.6.4.