[
https://issues.apache.org/jira/browse/HDFS-5540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13829622#comment-13829622
]
Binglin Chang commented on HDFS-5540:
-------------------------------------
Read the
[log|https://builds.apache.org/job/PreCommit-HDFS-Build/5504//testReport/org.apache.hadoop.hdfs.server.blockmanagement/TestBlocksWithNotEnoughRacks/testCorruptBlockRereplicatedAcrossRacks/]
{code}
2013-11-20 17:29:58,638 INFO DataNode.clienttrace
(BlockSender.java:sendBlock(734)) - src: /127.0.0.1:45980, dest:
/127.0.0.1:47450, bytes: 516, op: HDFS_READ, cliID:
DFSClient_NONMAPREDUCE_1758168951_1, offset: 0, srvID:
DS-655102145-67.195.138.24-45980-1384968592304, blockid:
BP-62019746-67.195.138.24-1384968591859:blk_1073741825_1001, duration: 244566
Waiting for 1 corrupt replicas
2013-11-20 17:29:58,660 INFO BlockStateChange
(CorruptReplicasMap.java:addToCorruptReplicasMap(88)) - BLOCK
NameSystem.addToCorruptReplicasMap: blk_1073741825 added as corrupt on
127.0.0.1:41043 by localhost/127.0.0.1 because client machine reported it
2013-11-20 17:29:59,346 INFO datanode.DataNode
(DataXceiver.java:writeBlock(594)) - Received
BP-62019746-67.195.138.24-1384968591859:blk_1073741825_1001 src:
/127.0.0.1:39752 dest: /127.0.0.1:49340 of size 512
2013-11-20 17:29:59,347 INFO BlockStateChange
(BlockManager.java:logAddStoredBlock(2275)) - BLOCK* addStoredBlock: blockMap
updated: 127.0.0.1:49340 is added to blk_1073741825_1001 size 512
2013-11-20 17:29:59,347 INFO BlockStateChange
(BlockManager.java:invalidateBlock(1092)) - BLOCK* invalidateBlock:
blk_1073741825_1001(same as stored) on 127.0.0.1:41043
2013-11-20 17:29:59,640 INFO FSNamesystem.audit
(FSNamesystem.java:logAuditMessage(7373)) - allowed=true ugi=jenkins
(auth:SIMPLE) ip=/127.0.0.1 cmd=open src=/testFile dst=null perm=null
Waiting for 1 corrupt replicas
{code}
From the log we can see that DFSTestUtil.waitCorruptReplicas checks for corrupt
replicas once every second, but HDFS detected the corrupt replica and recovered
the block in less than a second, so DFSTestUtil.waitCorruptReplicas never saw
the corrupt replica, which caused the timeout.
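
To illustrate the race, here is a minimal simulation on a virtual clock. It is only a sketch, not the actual DFSTestUtil code. The offsets 660 and 1347 are the addToCorruptReplicasMap and invalidateBlock timestamps from the log, in milliseconds after 17:29:58,000. The first poll at 638 is an approximation taken from the log line printed just before the first "Waiting for 1 corrupt replicas". The class name, the helper names, and the assumption that the corrupt count drops back to 0 at invalidateBlock are hypothetical.

```java
// Hypothetical sketch: a fixed-interval poll can miss a short-lived state.
class PollingRaceSketch {
    // Offsets in ms after 17:29:58,000, read from the log above.
    static final long MARKED_CORRUPT_AT = 660;  // addToCorruptReplicasMap
    static final long INVALIDATED_AT = 1347;    // invalidateBlock

    // Assumed model: the replica counts as corrupt only inside this window.
    static int corruptReplicasAt(long t) {
        return (t >= MARKED_CORRUPT_AT && t < INVALIDATED_AT) ? 1 : 0;
    }

    // Samples the count every `interval` ms starting at `firstPoll`;
    // returns true if any of the `maxPolls` samples equals `expected`.
    static boolean pollSees(int expected, long firstPoll, long interval, int maxPolls) {
        for (int i = 0; i < maxPolls; i++) {
            long t = firstPoll + i * interval;
            if (corruptReplicasAt(t) == expected) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // 1-second polls land at 638, 1638, 2638, ... and straddle the window.
        System.out.println(pollSees(1, 638, 1000, 10));  // prints "false"
        // 100 ms polls land at 638, 738, ... and hit the window at 738.
        System.out.println(pollSees(1, 638, 100, 100));  // prints "true"
    }
}
```

In this model the corrupt window lasts 687 ms, which is shorter than the 1-second poll interval, so the poll can fall entirely on either side of it.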
> Fix TestBlocksWithNotEnoughRacks
> --------------------------------
>
> Key: HDFS-5540
> URL: https://issues.apache.org/jira/browse/HDFS-5540
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: Binglin Chang
> Assignee: Binglin Chang
>
> TestBlocksWithNotEnoughRacks fails with a timeout while waiting for corrupt replicas:
> java.util.concurrent.TimeoutException: Timed out waiting for corrupt
> replicas. Waiting for 1, but only found 0
> at
> org.apache.hadoop.hdfs.DFSTestUtil.waitCorruptReplicas(DFSTestUtil.java:351)
> at
> org.apache.hadoop.hdfs.server.blockmanagement.TestBlocksWithNotEnoughRacks.testCorruptBlockRereplicatedAcrossRacks(TestBlocksWithNotEnoughRacks.java:219)