[
https://issues.apache.org/jira/browse/HDFS-6250?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13981701#comment-13981701
]
Chen He commented on HDFS-6250:
-------------------------------
I think this undeleted data block is caused by a race condition:
testBalancerWithRackLocality uses getDfsUsed() to count the HDFS usage. When
the test code checks the datanode usage, the value returned by getDfsUsed() has
not been updated yet.
Here is the whole process.
1) ReplicationMonitor runs every 5 seconds to check for data blocks that need
to be updated and sends them to the NN. We need 10 seconds to guarantee that
any information reaches the NN;
2) Once the NN gets the operations, it sends them to the corresponding DNs. It
takes 6 seconds (2 heartbeat intervals, assuming a heartbeat interval of 3
seconds) for the DNs to finish these operations and report updates back to the
NN;
3) If we also allow for other processing time, it will be safe to read the
latest information after 20 seconds.
Based on the analysis above, I attached my patch.
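
To illustrate the timing budget above, here is a minimal sketch. It is not the
attached patch and not Hadoop code: UsagePoller, worstCaseMillis and
waitForUsage are hypothetical names, and the 4 seconds of slack is only my
reading of "other processing time" (20 s minus the 10 s and 6 s from steps 1
and 2). It polls a usage value until it matches the expected one or the budget
runs out, rather than reading it once.

```java
import java.util.function.LongSupplier;

/** Hypothetical helper for illustration; not part of Hadoop or of the patch. */
final class UsagePoller {
    private UsagePoller() {}

    /**
     * Worst-case delay from the analysis above: two ReplicationMonitor
     * cycles, two heartbeat intervals, plus slack for other processing.
     */
    static long worstCaseMillis(long monitorIntervalMs,
                                long heartbeatIntervalMs,
                                long slackMs) {
        return 2 * monitorIntervalMs + 2 * heartbeatIntervalMs + slackMs;
    }

    /**
     * Re-reads the usage value until it equals the expected value or the
     * timeout elapses. Returns the last value read, so the caller can
     * assert on it and still see the stale number if the wait timed out.
     */
    static long waitForUsage(LongSupplier usage, long expected,
                             long timeoutMs, long pollMs) {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        long current = usage.getAsLong();
        while (current != expected && System.nanoTime() - deadline < 0) {
            try {
                Thread.sleep(pollMs);
            } catch (InterruptedException e) {
                // Preserve the interrupt and stop waiting early.
                Thread.currentThread().interrupt();
                return current;
            }
            current = usage.getAsLong();
        }
        return current;
    }
}
```

In a test, the supplier could wrap whatever call reports the usage (for
example, something built on getDfsUsed()), with the timeout taken from
worstCaseMillis(5000, 3000, 4000). Polling returns as soon as the value
settles, so the full 20 seconds is only spent in the failing case.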
> TestBalancerWithNodeGroup.testBalancerWithRackLocality fails
> ------------------------------------------------------------
>
> Key: HDFS-6250
> URL: https://issues.apache.org/jira/browse/HDFS-6250
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: Kihwal Lee
> Assignee: Chen He
> Attachments: test_log.txt
>
>
> It was seen in https://builds.apache.org/job/PreCommit-HDFS-Build/6669/
> {panel}
> java.lang.AssertionError: expected:<1800> but was:<1810>
> at org.junit.Assert.fail(Assert.java:93)
> at org.junit.Assert.failNotEquals(Assert.java:647)
> at org.junit.Assert.assertEquals(Assert.java:128)
> at org.junit.Assert.assertEquals(Assert.java:147)
> at org.apache.hadoop.hdfs.server.balancer.TestBalancerWithNodeGroup.testBalancerWithRackLocality(TestBalancerWithNodeGroup.java:253)
> {panel}
--
This message was sent by Atlassian JIRA
(v6.2#6252)