[ https://issues.apache.org/jira/browse/HDFS-10816?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16037595#comment-16037595 ]

Hudson commented on HDFS-10816:
-------------------------------

SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #11825 (See 
[https://builds.apache.org/job/Hadoop-trunk-Commit/11825/])
HDFS-10816. TestComputeInvalidateWork#testDatanodeReRegistration fails (kihwal: 
rev e4e203e0807fafc5dd765344d008e42bd51cc979)
* (edit) 
hadoop-hdfs-project/hadoop-hdfs/src/test/java/org/apache/hadoop/hdfs/server/blockmanagement/TestComputeInvalidateWork.java


> TestComputeInvalidateWork#testDatanodeReRegistration fails due to race 
> between test and replication monitor
> -----------------------------------------------------------------------------------------------------------
>
>                 Key: HDFS-10816
>                 URL: https://issues.apache.org/jira/browse/HDFS-10816
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Eric Badger
>            Assignee: Eric Badger
>             Fix For: 2.9.0, 3.0.0-alpha4, 2.8.2
>
>         Attachments: HDFS-10816.001.patch, HDFS-10816.002.patch, 
> HDFS-10816.002.patch, HDFS-10816-branch-2.002.patch
>
>
> {noformat}
> java.lang.AssertionError: Expected invalidate blocks to be the number of DNs expected:<3> but was:<2>
>       at org.junit.Assert.fail(Assert.java:88)
>       at org.junit.Assert.failNotEquals(Assert.java:743)
>       at org.junit.Assert.assertEquals(Assert.java:118)
>       at org.junit.Assert.assertEquals(Assert.java:555)
>       at org.apache.hadoop.hdfs.server.blockmanagement.TestComputeInvalidateWork.testDatanodeReRegistration(TestComputeInvalidateWork.java:160)
> {noformat}
> The test fails because of a race condition between the test and the 
> replication monitor. The default replication monitor interval is 3 seconds, 
> which is about how long the test normally takes to run. The test deletes 
> a file and then acquires the namesystem write lock. However, if the 
> replication monitor fires between those two steps, the test will 
> fail because the monitor will itself invalidate one of the blocks. This is 
> easy to reproduce by removing the sleep() in the ReplicationMonitor's run() 
> method in BlockManager.java, so that the replication monitor executes as 
> quickly as possible and exacerbates the race. 
> To fix the test, all that needs to be done is to turn off the replication 
> monitor. 
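
The race described above can be illustrated with a small, self-contained
sketch. This is not the HDFS code or the committed patch: the class
InvalidateRaceSketch and its methods (deleteFile, monitorTick,
countPendingUnderLock, runTest) are hypothetical names invented for
illustration, and the monitor "firing" is simulated by a direct call instead
of a real background thread so that the outcome is deterministic.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.concurrent.locks.ReentrantReadWriteLock;

// Hypothetical model of the race; none of these names come from HDFS.
class InvalidateRaceSketch {
    private final ReentrantReadWriteLock namesystemLock = new ReentrantReadWriteLock();
    private final Deque<String> pendingInvalidates = new ArrayDeque<>();
    private final boolean monitorEnabled;

    InvalidateRaceSketch(boolean monitorEnabled) {
        this.monitorEnabled = monitorEnabled;
    }

    // Deleting a file queues one block invalidation per datanode.
    void deleteFile(int numDatanodes) {
        namesystemLock.writeLock().lock();
        try {
            for (int i = 0; i < numDatanodes; i++) {
                pendingInvalidates.add("dn" + i);
            }
        } finally {
            namesystemLock.writeLock().unlock();
        }
    }

    // Stand-in for one pass of the replication monitor: if enabled,
    // it consumes one pending invalidation on its own.
    void monitorTick() {
        if (!monitorEnabled) {
            return;
        }
        namesystemLock.writeLock().lock();
        try {
            pendingInvalidates.poll();
        } finally {
            namesystemLock.writeLock().unlock();
        }
    }

    // What the test observes once it finally holds the write lock.
    int countPendingUnderLock() {
        namesystemLock.writeLock().lock();
        try {
            return pendingInvalidates.size();
        } finally {
            namesystemLock.writeLock().unlock();
        }
    }

    // The monitor tick lands in the window between the delete and the
    // lock acquisition, which is exactly the window the issue describes.
    static int runTest(boolean monitorEnabled) {
        InvalidateRaceSketch s = new InvalidateRaceSketch(monitorEnabled);
        s.deleteFile(3);
        s.monitorTick();
        return s.countPendingUnderLock();
    }

    public static void main(String[] args) {
        System.out.println("monitor on:  " + runTest(true));   // prints 2
        System.out.println("monitor off: " + runTest(false));  // prints 3
    }
}
```

With the simulated monitor enabled, the count drops to 2, matching the
"expected:<3> but was:<2>" failure; with it disabled, the count stays at 3
regardless of timing.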



