[jira] [Commented] (HDFS-9500) datanodesSoftwareVersions map may counting wrong when rolling upgrade

2016-10-27 Thread Hudson (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15613622#comment-15613622
 ] 

Hudson commented on HDFS-9500:
--

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #10712 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/10712/])
HDFS-9500. Fix software version counts for DataNodes during rolling (shv: rev 
f3ac1f41b8fa82a0ac87a207d7afa2061d90a9bd)
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestDatanodeManager.java


> datanodesSoftwareVersions map may counting wrong when rolling upgrade
> -
>
> Key: HDFS-9500
> URL: https://issues.apache.org/jira/browse/HDFS-9500
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.7.1, 2.6.2
>Reporter: Phil Yang
>Assignee: Erik Krogen
> Fix For: 2.9.0, 2.7.4, 3.0.0-alpha2
>
> Attachments: 9500-v1.patch, HDFS-9500.000.patch, HDFS-9500.001.patch, 
> HDFS-9500.002.patch
>
>
> During a rolling upgrade, the NameNode's web UI overview reports that the 
> cluster contains DataNodes of two versions, for example, 2.6.0 with x nodes and 
> 2.6.2 with y nodes. However, sometimes when I stop a DataNode running the old 
> version and start it on the new version, the NameNode increases the count for 
> the new version but does not decrease the count for the old version. So the 
> total x+y becomes larger than the number of DataNodes. Even after all DataNodes 
> are upgraded, the overview still reports several DataNodes on the old 
> version, and I must run hdfs dfsadmin -refreshNodes to clear this message.
> I think this issue is caused by DatanodeManager.registerDatanode. If nodeS on 
> the old version is not alive because it was shut down, it does not pass 
> shouldCountVersion, so the count for the old version is not decreased. But this 
> method only checks the heartbeat status and isAlive at that moment. If the 
> NameNode has not yet removed this node (the removal is what would decrease the 
> version map) and the node restarts on the new version, the 
> decrementVersionCount for this node is never executed.
> So the simplest fix is to always recount the version map in 
> registerDatanode, since it is not a heavy operation.
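
The "always recount" idea from the description can be sketched roughly as follows. This is only an illustration of the reporter's proposal, not the committed patch and not the real DatanodeManager code: NodeSketch, RecountingVersionTracker and all of their methods are hypothetical names invented for this example.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical stand-in for a DataNode record; not the real HDFS class. */
final class NodeSketch {
  String softwareVersion;
  boolean alive;

  NodeSketch(String softwareVersion, boolean alive) {
    this.softwareVersion = softwareVersion;
    this.alive = alive;
  }
}

/** Hypothetical tracker that rebuilds its version map on every change. */
final class RecountingVersionTracker {
  private final Map<String, NodeSketch> nodes = new HashMap<>();
  private final Map<String, Integer> versionCounts = new HashMap<>();

  /** Rebuilds the map from scratch, counting only nodes flagged alive. */
  private void recountVersions() {
    versionCounts.clear();
    for (NodeSketch node : nodes.values()) {
      if (node.alive) {
        versionCounts.merge(node.softwareVersion, 1, Integer::sum);
      }
    }
  }

  /** Registers or re-registers a node, then recounts unconditionally. */
  void register(String nodeId, String version) {
    nodes.put(nodeId, new NodeSketch(version, true));
    recountVersions();
  }

  /** Flags a node as dead and recounts. Unknown IDs are ignored. */
  void markDead(String nodeId) {
    NodeSketch node = nodes.get(nodeId);
    if (node != null) {
      node.alive = false;
      recountVersions();
    }
  }

  /** Returns a sorted copy of the current counts. */
  Map<String, Integer> counts() {
    return new TreeMap<>(versionCounts);
  }

  public static void main(String[] args) {
    RecountingVersionTracker tracker = new RecountingVersionTracker();
    tracker.register("dn1", "2.6.0");
    tracker.register("dn2", "2.6.0");
    tracker.register("dn1", "2.6.2"); // dn1 restarts on the new version
    System.out.println(tracker.counts()); // prints {2.6.0=1, 2.6.2=1}
  }
}
```

Because the map is rebuilt from the node list each time, no decrement can be skipped. The cost is one pass over all known nodes per registration, which the reporter considers acceptable.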



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-9500) datanodesSoftwareVersions map may counting wrong when rolling upgrade

2016-10-27 Thread Konstantin Shvachko (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15613383#comment-15613383
 ] 

Konstantin Shvachko commented on HDFS-9500:
---

Very strange. The build page on Jenkins says "Test Result (no failures)".
I guess something went wrong with Jenkins reporting.







[jira] [Commented] (HDFS-9500) datanodesSoftwareVersions map may counting wrong when rolling upgrade

2016-10-26 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15609303#comment-15609303
 ] 

Hadoop QA commented on HDFS-9500:
-

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 20s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
| +1 | mvninstall | 8m 40s | trunk passed |
| +1 | compile | 0m 58s | trunk passed |
| +1 | checkstyle | 0m 35s | trunk passed |
| +1 | mvnsite | 1m 8s | trunk passed |
| +1 | mvneclipse | 0m 14s | trunk passed |
| +1 | findbugs | 2m 8s | trunk passed |
| +1 | javadoc | 0m 49s | trunk passed |
| +1 | mvninstall | 1m 0s | the patch passed |
| +1 | compile | 0m 56s | the patch passed |
| +1 | javac | 0m 56s | the patch passed |
| +1 | checkstyle | 0m 30s | the patch passed |
| +1 | mvnsite | 1m 5s | the patch passed |
| +1 | mvneclipse | 0m 12s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | findbugs | 2m 9s | the patch passed |
| +1 | javadoc | 0m 42s | the patch passed |
| -1 | unit | 42m 26s | hadoop-hdfs in the patch failed. |
| +1 | asflicense | 0m 21s | The patch does not generate ASF License warnings. |
| | | 65m 37s | |

|| Reason || Tests ||
| Timed out junit tests | org.apache.hadoop.hdfs.TestLeaseRecovery2 |
|   | org.apache.hadoop.hdfs.TestFileChecksum |
|   | org.apache.hadoop.hdfs.TestParallelShortCircuitLegacyRead |
|   | org.apache.hadoop.hdfs.server.namenode.TestStartup |
|   | org.apache.hadoop.hdfs.TestWriteRead |
|   | org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure |
|   | org.apache.hadoop.fs.viewfs.TestViewFileSystemHdfs |
|   | org.apache.hadoop.hdfs.server.blockmanagement.TestDatanodeManager |
|   | org.apache.hadoop.hdfs.TestFileCreationClient |
|   | org.apache.hadoop.hdfs.server.datanode.TestDataNodeVolumeFailureReporting |
|   | org.apache.hadoop.hdfs.server.datanode.TestNNHandlesBlockReportPerStorage |
|   | org.apache.hadoop.hdfs.server.datanode.TestDataNodeMetrics |
|   | org.apache.hadoop.hdfs.server.datanode.TestNNHandlesCombinedBlockReport |
|   | org.apache.hadoop.hdfs.server.datanode.TestDataNodeHotSwapVolumes |
|   | org.apache.hadoop.hdfs.server.namenode.ha.TestStandbyCheckpoints |
|   | org.apache.hadoop.hdfs.server.datanode.TestDataNodeRollingUpgrade |
|   | org.apache.hadoop.hdfs.TestMaintenanceState |
|   | org.apache.hadoop.hdfs.server.blockmanagement.TestReplicationPolicy |
|   | org.apache.hadoop.hdfs.server.namenode.TestFileContextAcl |
|   | org.apache.hadoop.fs.TestEnhancedByteBufferAccess |
|   | org.apache.hadoop.hdfs.server.namenode.TestLargeDirectoryDelete |
|   | org.apache.hadoop.hdfs.TestDFSClientFailover |
|   | org.apache.hadoop.hdfs.TestSetrepIncreasing |
|   | org.apache.hadoop.hdfs.TestClientProtocolForPipelineRecovery |
|   | org.apache.hadoop.hdfs.TestDFSStripedOutputStreamWithFailure180 |

[jira] [Commented] (HDFS-9500) datanodesSoftwareVersions map may counting wrong when rolling upgrade

2016-10-26 Thread Konstantin Shvachko (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15609243#comment-15609243
 ] 

Konstantin Shvachko commented on HDFS-9500:
---

+1. Looks good.
Will commit in a bit.







[jira] [Commented] (HDFS-9500) datanodesSoftwareVersions map may counting wrong when rolling upgrade

2016-10-26 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15609175#comment-15609175
 ] 

Hadoop QA commented on HDFS-9500:
-

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 21s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
| +1 | mvninstall | 8m 32s | trunk passed |
| +1 | compile | 1m 2s | trunk passed |
| +1 | checkstyle | 0m 31s | trunk passed |
| +1 | mvnsite | 1m 8s | trunk passed |
| +1 | mvneclipse | 0m 14s | trunk passed |
| +1 | findbugs | 2m 4s | trunk passed |
| +1 | javadoc | 0m 45s | trunk passed |
| +1 | mvninstall | 1m 1s | the patch passed |
| +1 | compile | 1m 0s | the patch passed |
| +1 | javac | 1m 0s | the patch passed |
| +1 | checkstyle | 0m 26s | the patch passed |
| +1 | mvnsite | 0m 51s | the patch passed |
| +1 | mvneclipse | 0m 10s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | findbugs | 1m 52s | the patch passed |
| +1 | javadoc | 0m 40s | the patch passed |
| -1 | unit | 61m 56s | hadoop-hdfs in the patch failed. |
| +1 | asflicense | 0m 17s | The patch does not generate ASF License warnings. |
| | | 84m 6s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.namenode.TestNameNodeMetadataConsistency |
| Timed out junit tests | org.apache.hadoop.hdfs.server.namenode.TestLargeDirectoryDelete |
|   | org.apache.hadoop.hdfs.TestDFSUpgradeFromImage |
|   | org.apache.hadoop.hdfs.TestInjectionForSimulatedStorage |
|   | org.apache.hadoop.hdfs.server.namenode.TestNameNodeReconfigure |
|   | org.apache.hadoop.hdfs.TestLeaseRecovery |
|   | org.apache.hadoop.hdfs.server.namenode.TestFileTruncate |
|   | org.apache.hadoop.hdfs.TestRestartDFS |
|   | org.apache.hadoop.hdfs.server.namenode.TestDeleteRace |
|   | org.apache.hadoop.hdfs.TestLeaseRecoveryStriped |
|   | org.apache.hadoop.hdfs.server.namenode.TestAuditLogs |

|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:9560f25 |
| JIRA Issue | HDFS-9500 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12835378/HDFS-9500.001.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
| uname | Linux bbf763f21909 3.13.0-96-generic #143-Ubuntu SMP Mon Aug 29 20:15:20 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 1f8490a |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
| unit | https://builds.apache.org/job/PreCommit-HDFS-Build/17293/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/17293/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: 

[jira] [Commented] (HDFS-9500) datanodesSoftwareVersions map may counting wrong when rolling upgrade

2016-10-26 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15609059#comment-15609059
 ] 

Ravi Prakash commented on HDFS-9500:


Thanks for your patch Erik and your review Konst!
bq. // Check isAlive too because right after removeDatanode(),
bq. // isDatanodeDead() is still true 
This comment is not valid anymore after your changes. Could you please fix this 
too?








[jira] [Commented] (HDFS-9500) datanodesSoftwareVersions map may counting wrong when rolling upgrade

2016-10-25 Thread Konstantin Shvachko (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15607096#comment-15607096
 ] 

Konstantin Shvachko commented on HDFS-9500:
---

Good find, Erik. I agree we should decrement the version count whenever the DN 
is alive. The heartbeat interval expiration doesn't matter here because the 
node will be marked alive by that same method. It should also work for the full 
version recount in {{countSoftwareVersions()}}. Two nits:
# Could you please add Javadoc
#* to {{shouldCountVersion()}} saying we count versions for all alive nodes
#* and to the new test explaining its purpose
# You do not need to use Mockito for setting fields in {{DatanodeRegistration}}. Can't you just use a constructor?
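
The review points above (count versions for every alive node, and document that in Javadoc) could look roughly like the sketch below. It is an assumption-laden illustration rather than the patch itself: DnSketch and IncrementalVersionCountSketch are hypothetical classes, and only the method name shouldCountVersion is borrowed from the discussion.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical stand-in for a DataNode descriptor; not the HDFS class. */
final class DnSketch {
  String version;
  boolean alive;

  DnSketch(String version, boolean alive) {
    this.version = version;
    this.alive = alive;
  }
}

/** Hypothetical incremental counter keyed by software version. */
final class IncrementalVersionCountSketch {
  private final Map<String, Integer> counts = new HashMap<>();

  /**
   * Versions are counted for all alive nodes, regardless of how long ago
   * the last heartbeat arrived.
   */
  static boolean shouldCountVersion(DnSketch node) {
    return node.version != null && node.alive;
  }

  /** Adds one to the node's version count if the node qualifies. */
  void increment(DnSketch node) {
    if (shouldCountVersion(node)) {
      counts.merge(node.version, 1, Integer::sum);
    }
  }

  /** Subtracts one, dropping the key once its count reaches zero. */
  void decrement(DnSketch node) {
    if (shouldCountVersion(node)) {
      counts.computeIfPresent(node.version, (v, c) -> c > 1 ? c - 1 : null);
    }
  }

  /** Re-registration: release the old version, then count the new one. */
  void reRegister(DnSketch node, String newVersion) {
    decrement(node);
    node.version = newVersion;
    node.alive = true;
    increment(node);
  }

  /** Returns a sorted copy of the current counts. */
  Map<String, Integer> snapshot() {
    return new TreeMap<>(counts);
  }

  public static void main(String[] args) {
    IncrementalVersionCountSketch map = new IncrementalVersionCountSketch();
    DnSketch a = new DnSketch("2.6.0", true);
    DnSketch b = new DnSketch("2.6.0", true);
    map.increment(a);
    map.increment(b);
    map.reRegister(a, "2.6.2");
    System.out.println(map.snapshot()); // prints {2.6.0=1, 2.6.2=1}
  }
}
```

Using the same predicate for both the increment and the decrement keeps the two paths symmetric, so a node that was counted under its old version is always released before it is counted under the new one.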







[jira] [Commented] (HDFS-9500) datanodesSoftwareVersions map may counting wrong when rolling upgrade

2016-10-25 Thread Erik Krogen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15606385#comment-15606385
 ] 

Erik Krogen commented on HDFS-9500:
---

The {{TestDiskspaceQuotaUpdate}} failure is unrelated and documented in 
HDFS-10921. The other 5 tests all pass locally. 







[jira] [Commented] (HDFS-9500) datanodesSoftwareVersions map may counting wrong when rolling upgrade

2016-10-25 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15606316#comment-15606316
 ] 

Hadoop QA commented on HDFS-9500:
-

| (x) *-1 overall* |

|| Vote || Subsystem || Runtime || Comment ||
| 0 | reexec | 0m 17s | Docker mode activated. |
| +1 | @author | 0m 0s | The patch does not contain any @author tags. |
| +1 | test4tests | 0m 0s | The patch appears to include 1 new or modified test files. |
| +1 | mvninstall | 8m 32s | trunk passed |
| +1 | compile | 0m 57s | trunk passed |
| +1 | checkstyle | 0m 32s | trunk passed |
| +1 | mvnsite | 1m 3s | trunk passed |
| +1 | mvneclipse | 0m 14s | trunk passed |
| +1 | findbugs | 1m 52s | trunk passed |
| +1 | javadoc | 0m 41s | trunk passed |
| +1 | mvninstall | 0m 45s | the patch passed |
| +1 | compile | 0m 41s | the patch passed |
| +1 | javac | 0m 41s | the patch passed |
| +1 | checkstyle | 0m 23s | the patch passed |
| +1 | mvnsite | 0m 48s | the patch passed |
| +1 | mvneclipse | 0m 9s | the patch passed |
| +1 | whitespace | 0m 0s | The patch has no whitespace issues. |
| +1 | findbugs | 1m 48s | the patch passed |
| +1 | javadoc | 0m 36s | the patch passed |
| -1 | unit | 43m 7s | hadoop-hdfs in the patch failed. |
| +1 | asflicense | 0m 20s | The patch does not generate ASF License warnings. |
| | | 63m 57s | |

|| Reason || Tests ||
| Failed junit tests | hadoop.hdfs.server.namenode.TestDiskspaceQuotaUpdate |
| Timed out junit tests | org.apache.hadoop.hdfs.server.datanode.TestDataNodeRollingUpgrade |
|   | org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestLazyWriter |
|   | org.apache.hadoop.tracing.TestTracing |
|   | org.apache.hadoop.hdfs.server.datanode.fsdataset.impl.TestSpaceReservation |
|   | org.apache.hadoop.hdfs.server.datanode.TestFsDatasetCache |

|| Subsystem || Report/Notes ||
| Docker | Image:yetus/hadoop:9560f25 |
| JIRA Issue | HDFS-9500 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12835171/HDFS-9500.000.patch |
| Optional Tests | asflicense compile javac javadoc mvninstall mvnsite unit findbugs checkstyle |
| uname | Linux a1d58e835aba 3.13.0-92-generic #139-Ubuntu SMP Tue Jun 28 20:42:26 UTC 2016 x86_64 x86_64 x86_64 GNU/Linux |
| Build tool | maven |
| Personality | /testptch/hadoop/patchprocess/precommit/personality/provided.sh |
| git revision | trunk / 9a8a386 |
| Default Java | 1.8.0_101 |
| findbugs | v3.0.0 |
| unit | https://builds.apache.org/job/PreCommit-HDFS-Build/17278/artifact/patchprocess/patch-unit-hadoop-hdfs-project_hadoop-hdfs.txt |
| Test Results | https://builds.apache.org/job/PreCommit-HDFS-Build/17278/testReport/ |
| modules | C: hadoop-hdfs-project/hadoop-hdfs U: hadoop-hdfs-project/hadoop-hdfs |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/17278/console |
| Powered by | Apache Yetus 0.4.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.




[jira] [Commented] (HDFS-9500) datanodesSoftwareVersions map may counting wrong when rolling upgrade

2016-10-24 Thread Erik Krogen (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15603695#comment-15603695
 ] 

Erik Krogen commented on HDFS-9500:
---

Assigning this to myself since Phil does not seem to be actively working on it 
anymore.

I can (intermittently) reproduce this test failure on branch-2.7 if I increase 
the number of iterations on 
{{TestDatanodeManager.testNumVersionsReportedCorrect}} to 5000. I found that 
for the node whose version should have been decremented, 
{{shouldDecrementVersion()}} returned false because {{isDatanodeDead()}} was 
true (but {{isAlive}} was also true). 

It seems this situation could arise if the time since the node's last heartbeat 
exceeded the threshold for declaring it dead, but the {{HeartbeatManager}} had 
not yet marked it dead. I am open to suggestions about this. Would checking 
only {{DatanodeDescriptor.isAlive}} be sufficient here, instead of checking 
both {{isAlive}} and {{isDatanodeDead()}}? 
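
To make the window described above concrete, here is a small sketch of the two candidate conditions. Everything in it is hypothetical: LivenessSketch and DecrementConditionSketch are invented for illustration, and the expiry arithmetic is a simplified assumption, not the actual {{isDatanodeDead()}} logic.

```java
/** Hypothetical snapshot of a node's liveness state; not the HDFS class. */
final class LivenessSketch {
  final boolean isAlive;          // flag cleared only by a periodic monitor
  final long lastHeartbeatMillis; // arrival time of the last heartbeat

  LivenessSketch(boolean isAlive, long lastHeartbeatMillis) {
    this.isAlive = isAlive;
    this.lastHeartbeatMillis = lastHeartbeatMillis;
  }
}

/** Hypothetical comparison of the two decrement conditions. */
final class DecrementConditionSketch {
  /** True once the last heartbeat is older than the expiry interval. */
  static boolean heartbeatExpired(LivenessSketch node, long nowMillis,
      long expiryMillis) {
    return nowMillis - node.lastHeartbeatMillis > expiryMillis;
  }

  /** Combined check: alive flag set and heartbeat not yet expired. */
  static boolean aliveAndFresh(LivenessSketch node, long nowMillis,
      long expiryMillis) {
    return node.isAlive && !heartbeatExpired(node, nowMillis, expiryMillis);
  }

  /** Alternative check: rely on the alive flag only. */
  static boolean aliveOnly(LivenessSketch node) {
    return node.isAlive;
  }

  public static void main(String[] args) {
    // The heartbeat is stale, but the monitor has not cleared the flag yet.
    LivenessSketch node = new LivenessSketch(true, 0L);
    long now = 1_000L;
    long expiry = 500L;
    System.out.println(aliveAndFresh(node, now, expiry)); // prints false
    System.out.println(aliveOnly(node));                  // prints true
  }
}
```

In that window the combined check refuses to decrement even though the node was counted earlier, which matches the stale count seen in the test. The flag-only check would decrement it.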

> datanodesSoftwareVersions map may counting wrong when rolling upgrade
> -
>
> Key: HDFS-9500
> URL: https://issues.apache.org/jira/browse/HDFS-9500
> Project: Hadoop HDFS
>  Issue Type: Bug
>Affects Versions: 2.7.1, 2.6.2
>Reporter: Phil Yang
>Assignee: Erik Krogen
> Attachments: 9500-v1.patch
>
>
> During a rolling upgrade, the namenode's web UI overview reports that there 
> are two versions of datanodes in the cluster, for example, 2.6.0 has x nodes 
> and 2.6.2 has y nodes. However, sometimes when I stop a datanode running the 
> old version and start one running the new version, the namenode increases the 
> count for the new version but does not decrease the count for the old 
> version, so the total x+y becomes larger than the number of datanodes. Even 
> after all datanodes are upgraded, the message that several datanodes are 
> still on the old version remains, and I must run hdfs dfsadmin -refreshNodes 
> to clear it.
> I think this issue is caused by DatanodeManager.registerDatanode. If nodeS on 
> the old version is not alive because it was shut down, it will not pass 
> shouldCountVersion, so the count for the old version won't be decreased. But 
> this method only checks the heartbeat status and isAlive at that moment; if 
> the namenode has not yet removed this node (which would decrement the version 
> map) and the node restarts on the new version, the decrementVersionCount for 
> this node will never be executed.
> So the simplest fix is to always recount the version map in 
> registerDatanode, since it is not a heavy operation.
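
The miscount in the description can be reproduced with a toy model of the
bookkeeping. This is a sketch under stated assumptions, not the Hadoop
implementation: {{VersionMapSketch}}, its {{Node}} record and the
{{countable}} flag (standing in for "passes shouldCountVersion") are
hypothetical names invented for this example.

```java
import java.util.HashMap;
import java.util.Map;

// Toy model of a per-version node counter; all names are illustrative assumptions.
final class VersionMapSketch {
    static final class Node {
        String version;
        boolean countable; // stands in for "passes the should-count check"

        Node(String version, boolean countable) {
            this.version = version;
            this.countable = countable;
        }
    }

    final Map<String, Node> nodes = new HashMap<>();
    final Map<String, Integer> versionCounts = new HashMap<>();

    void increment(String version) {
        versionCounts.merge(version, 1, Integer::sum);
    }

    void decrement(String version) {
        // Drop the entry entirely once its count would reach zero.
        versionCounts.computeIfPresent(version, (k, c) -> c > 1 ? c - 1 : null);
    }

    void add(String id, String version) {
        nodes.put(id, new Node(version, true));
        increment(version);
    }

    /** The node stops, but it is not removed, so nothing is decremented yet. */
    void markStopped(String id) {
        nodes.get(id).countable = false;
    }

    /** Incremental bookkeeping on re-registration, as described in the report. */
    void reRegisterIncremental(String id, String newVersion) {
        Node n = nodes.get(id);
        if (n.countable) { // skipped for the stopped node: the old count leaks
            decrement(n.version);
        }
        n.version = newVersion;
        n.countable = true;
        increment(newVersion);
    }

    /** The fix proposed in the report: rebuild the whole map. */
    void recount() {
        versionCounts.clear();
        for (Node n : nodes.values()) {
            if (n.countable) {
                increment(n.version);
            }
        }
    }

    int total() {
        int sum = 0;
        for (int c : versionCounts.values()) {
            sum += c;
        }
        return sum;
    }

    public static void main(String[] args) {
        VersionMapSketch m = new VersionMapSketch();
        m.add("dn1", "2.6.0");
        m.add("dn2", "2.6.0");
        m.markStopped("dn1");
        m.reRegisterIncremental("dn1", "2.6.2");
        System.out.println(m.total()); // 3, although there are only 2 nodes
        m.recount();
        System.out.println(m.total()); // 2
    }
}
```

The recount restores a consistent map at the cost of visiting every node,
which is the trade-off the rest of the thread discusses.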



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: hdfs-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: hdfs-issues-h...@hadoop.apache.org



[jira] [Commented] (HDFS-9500) datanodesSoftwareVersions map may counting wrong when rolling upgrade

2016-10-18 Thread Konstantin Shvachko (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15586504#comment-15586504
 ] 

Konstantin Shvachko commented on HDFS-9500:
---

Sounds like the condition for decrementing the old version is not accurate. I 
see this on trunk and other versions.
[~yangzhe1991] do you still plan to work on it?




[jira] [Commented] (HDFS-9500) datanodesSoftwareVersions map may counting wrong when rolling upgrade

2016-08-17 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15425757#comment-15425757
 ] 

Hadoop QA commented on HDFS-9500:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s {color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 5s {color} | {color:red} HDFS-9500 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12775497/9500-v1.patch |
| JIRA Issue | HDFS-9500 |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/16465/console |
| Powered by | Apache Yetus 0.4.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.






[jira] [Commented] (HDFS-9500) datanodesSoftwareVersions map may counting wrong when rolling upgrade

2016-02-26 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15170009#comment-15170009
 ] 

Ravi Prakash commented on HDFS-9500:


I see HDFS-9371 did away with the finer grained locking we had implemented in 
{{incrementVersionCount}} and {{decrementVersionCount}} (earlier we were 
synchronizing on datanodeMap, and now we synchronize on the entire 
DatanodeManager). 

bq. I think this issue is caused by DatanodeManager.registerDatanode. If nodeS 
in old version is not alive because of shutting down, it will not pass 
shouldCountVersion, so the number of old version won't be decreased. But this 
method only judges the status of heartbeat and isAlive at that moment, if 
namenode has not removed this node which will decrease the version map and this 
node restarts in the new version, the decrementVersionCount belongs to this 
node will never be executed.

Thanks for the analysis [~yangzhe1991]! Could you please help me understand it? 
Which version of Hadoop did you experience this on? How do you update the 
version of the DNs? Do you let a long time pass between bringing down the DN in 
the old version and then bringing back a DN with the new version?
What state is the Datanode in when its old version is not decremented?

Wouldn't 
https://github.com/apache/hadoop/blob/branch-2.6.2/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/blockmanagement/DatanodeManager.java#L520
 decrement the version count? 

[~kihwal] Are you seeing this too?



[jira] [Commented] (HDFS-9500) datanodesSoftwareVersions map may counting wrong when rolling upgrade

2016-02-26 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15169705#comment-15169705
 ] 

Hadoop QA commented on HDFS-9500:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s {color} | {color:blue} Docker mode activated. {color} |
| {color:red}-1{color} | {color:red} patch {color} | {color:red} 0m 5s {color} | {color:red} HDFS-9500 does not apply to trunk. Rebase required? Wrong Branch? See https://wiki.apache.org/hadoop/HowToContribute for help. {color} |
\\
\\
|| Subsystem || Report/Notes ||
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12775497/9500-v1.patch |
| JIRA Issue | HDFS-9500 |
| Console output | https://builds.apache.org/job/PreCommit-HDFS-Build/14633/console |
| Powered by | Apache Yetus 0.2.0-SNAPSHOT   http://yetus.apache.org |


This message was automatically generated.





[jira] [Commented] (HDFS-9500) datanodesSoftwareVersions map may counting wrong when rolling upgrade

2016-02-26 Thread Ravi Prakash (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15169557#comment-15169557
 ] 

Ravi Prakash commented on HDFS-9500:


Thanks for reporting the issue Phil and for bringing it to my attention Kihwal! 
Sorry for the late reply (my inbox ate mail) :(

We wanted to avoid recounting the version map for every registration because, 
as Kihwal points out, it could be an expensive operation for big clusters (and 
you have to do it for every registration / dead node). I'd be in favor of 
fixing the problem without having to count all the versions every time, 
although if it's too onerous, I'm fine with Phil's proposal as well. I'll look 
through the code to see if there's an easy fix.
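
To make the cost concern concrete, here is a rough back-of-the-envelope sketch
in Java. It assumes, purely for illustration, that n datanodes register one
after another during start-up and that a full recount visits every node
registered so far; {{RecountCost}} and the figure of 4000 nodes are
hypothetical and not taken from any measurement.

```java
// Rough cost model; the class name and the cluster size are assumptions.
final class RecountCost {
    /** Node visits if the whole map is recounted after each of n registrations. */
    static long recountVisits(int n) {
        long visits = 0;
        for (int registered = 1; registered <= n; registered++) {
            visits += registered; // the k-th recount walks k registered nodes
        }
        return visits;
    }

    /** Map updates with incremental bookkeeping: one per registration. */
    static long incrementalUpdates(int n) {
        return n;
    }

    public static void main(String[] args) {
        int n = 4000;
        System.out.println(recountVisits(n));      // 8002000
        System.out.println(incrementalUpdates(n)); // 4000
    }
}
```

Under these assumptions the recount grows quadratically with cluster size,
while the incremental update stays linear.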




[jira] [Commented] (HDFS-9500) datanodesSoftwareVersions map may counting wrong when rolling upgrade

2016-01-27 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15120148#comment-15120148
 ] 

Junping Du commented on HDFS-9500:
--

Move all non-critical pending issues out of 2.6.4 into 2.6.5.



[jira] [Commented] (HDFS-9500) datanodesSoftwareVersions map may counting wrong when rolling upgrade

2016-01-20 Thread Kihwal Lee (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15109452#comment-15109452
 ] 

Kihwal Lee commented on HDFS-9500:
--

No doubt this op will be hot during start-up of a several thousand node 
cluster. 
What do you think [~raviprak]?



[jira] [Commented] (HDFS-9500) datanodesSoftwareVersions map may counting wrong when rolling upgrade

2016-01-17 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15103980#comment-15103980
 ] 

Junping Du commented on HDFS-9500:
--

bq. So the simplest way to fix this is that we always recounting the version 
map in registerDatanode since it is not a heavy operation.
[~kihwal], do we have any concerns about this solution when the cluster is 
very large?



[jira] [Commented] (HDFS-9500) datanodesSoftwareVersions map may counting wrong when rolling upgrade

2015-12-03 Thread Hadoop QA (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15037555#comment-15037555
 ] 

Hadoop QA commented on HDFS-9500:
-

| (x) *{color:red}-1 overall{color}* |
\\
\\
|| Vote || Subsystem || Runtime || Comment ||
| {color:blue}0{color} | {color:blue} reexec {color} | {color:blue} 0m 0s {color} | {color:blue} Docker mode activated. {color} |
| {color:green}+1{color} | {color:green} @author {color} | {color:green} 0m 0s {color} | {color:green} The patch does not contain any @author tags. {color} |
| {color:red}-1{color} | {color:red} test4tests {color} | {color:red} 0m 0s {color} | {color:red} The patch doesn't appear to include any new or modified tests. Please justify why no new tests are needed for this patch. Also please list what manual steps were performed to verify this patch. {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 8m 22s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 44s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 45s {color} | {color:green} trunk passed with JDK v1.7.0_85 {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 17s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 57s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 14s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 5s {color} | {color:green} trunk passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 9s {color} | {color:green} trunk passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 52s {color} | {color:green} trunk passed with JDK v1.7.0_85 {color} |
| {color:green}+1{color} | {color:green} mvninstall {color} | {color:green} 0m 52s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 45s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 45s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} compile {color} | {color:green} 0m 45s {color} | {color:green} the patch passed with JDK v1.7.0_85 {color} |
| {color:green}+1{color} | {color:green} javac {color} | {color:green} 0m 45s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} checkstyle {color} | {color:green} 0m 17s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvnsite {color} | {color:green} 0m 57s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} mvneclipse {color} | {color:green} 0m 14s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} whitespace {color} | {color:green} 0m 0s {color} | {color:green} Patch has no whitespace issues. {color} |
| {color:green}+1{color} | {color:green} findbugs {color} | {color:green} 2m 12s {color} | {color:green} the patch passed {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 12s {color} | {color:green} the patch passed with JDK v1.8.0_66 {color} |
| {color:green}+1{color} | {color:green} javadoc {color} | {color:green} 1m 51s {color} | {color:green} the patch passed with JDK v1.7.0_85 {color} |
| {color:red}-1{color} | {color:red} unit {color} | {color:red} 56m 26s {color} | {color:red} hadoop-hdfs in the patch failed with JDK v1.8.0_66. {color} |
| {color:green}+1{color} | {color:green} unit {color} | {color:green} 53m 30s {color} | {color:green} hadoop-hdfs in the patch passed with JDK v1.7.0_85. {color} |
| {color:red}-1{color} | {color:red} asflicense {color} | {color:red} 0m 23s {color} | {color:red} Patch generated 59 ASF License warnings. {color} |
| {color:black}{color} | {color:black} {color} | {color:black} 138m 44s {color} | {color:black} {color} |
\\
\\
|| Reason || Tests ||
| JDK v1.8.0_66 Failed junit tests | hadoop.hdfs.TestDatanodeRegistration |
|   | hadoop.hdfs.server.namenode.ha.TestSeveralNameNodes |
|   | hadoop.hdfs.TestFileAppend |
\\
\\
|| Subsystem || Report/Notes ||
| Docker |  Image:yetus/hadoop:0ca8df7 |
| JIRA Patch URL | https://issues.apache.org/jira/secure/attachment/12775497/9500-v1.patch |
| JIRA Issue | HDFS-9500 |
| Optional Tests |  asflicense  compile  javac  javadoc  mvninstall  mvnsite  unit  findbugs  checkstyle  |
| uname | Linux c6f8a79943f4 3.13.0-36-lowlatency #63-Ubuntu SMP PREEMPT Wed 

[jira] [Commented] (HDFS-9500) datanodesSoftwareVersions map may counting wrong when rolling upgrade

2015-12-03 Thread Junping Du (JIRA)

[ 
https://issues.apache.org/jira/browse/HDFS-9500?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15037832#comment-15037832
 ] 

Junping Du commented on HDFS-9500:
--

Moving non-blocker/non-critical issues out of 2.6.3 into 2.6.4.
