[ 
https://issues.apache.org/jira/browse/HDFS-5540?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13829622#comment-13829622
 ] 

Binglin Chang commented on HDFS-5540:
-------------------------------------

Read the 
[log|https://builds.apache.org/job/PreCommit-HDFS-Build/5504//testReport/org.apache.hadoop.hdfs.server.blockmanagement/TestBlocksWithNotEnoughRacks/testCorruptBlockRereplicatedAcrossRacks/]

{code}
2013-11-20 17:29:58,638 INFO  DataNode.clienttrace 
(BlockSender.java:sendBlock(734)) - src: /127.0.0.1:45980, dest: 
/127.0.0.1:47450, bytes: 516, op: HDFS_READ, cliID: 
DFSClient_NONMAPREDUCE_1758168951_1, offset: 0, srvID: 
DS-655102145-67.195.138.24-45980-1384968592304, blockid: 
BP-62019746-67.195.138.24-1384968591859:blk_1073741825_1001, duration: 244566
Waiting for 1 corrupt replicas

2013-11-20 17:29:58,660 INFO  BlockStateChange 
(CorruptReplicasMap.java:addToCorruptReplicasMap(88)) - BLOCK 
NameSystem.addToCorruptReplicasMap: blk_1073741825 added as corrupt on 
127.0.0.1:41043 by localhost/127.0.0.1 because client machine reported it

2013-11-20 17:29:59,346 INFO  datanode.DataNode 
(DataXceiver.java:writeBlock(594)) - Received 
BP-62019746-67.195.138.24-1384968591859:blk_1073741825_1001 src: 
/127.0.0.1:39752 dest: /127.0.0.1:49340 of size 512
2013-11-20 17:29:59,347 INFO  BlockStateChange 
(BlockManager.java:logAddStoredBlock(2275)) - BLOCK* addStoredBlock: blockMap 
updated: 127.0.0.1:49340 is added to blk_1073741825_1001 size 512
2013-11-20 17:29:59,347 INFO  BlockStateChange 
(BlockManager.java:invalidateBlock(1092)) - BLOCK* invalidateBlock: 
blk_1073741825_1001(same as stored) on 127.0.0.1:41043

2013-11-20 17:29:59,640 INFO  FSNamesystem.audit 
(FSNamesystem.java:logAuditMessage(7373)) - allowed=true ugi=jenkins 
(auth:SIMPLE) ip=/127.0.0.1 cmd=open  src=/testFile dst=null  perm=null
Waiting for 1 corrupt replicas
{code}

From the log we can see that DFSTestUtil.waitCorruptReplicas checks for corrupt 
replicas once per second, but HDFS detected the corrupt replica and recovered 
the block in under a second (marked corrupt at 17:29:58,660, invalidated at 
17:29:59,347). DFSTestUtil.waitCorruptReplicas therefore never observed the 
corrupt replica, and the test timed out.
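
To make the timing concrete, here is a small sketch that replays the timestamps from the log above. It is not the DFSTestUtil code: the class and method names are hypothetical, and the first poll time is an assumption, since the "Waiting for 1 corrupt replicas" lines carry no timestamp of their own and I am taking the stamp of the log line printed just before them.

```java
// Sketch only: replays the log timestamps; not the actual DFSTestUtil logic.
final class PollRaceSketch {

    /** Returns true if any poll instant falls inside [corruptFromMs, corruptUntilMs). */
    static boolean pollObservesCorruption(long firstPollMs, long intervalMs, int polls,
                                          long corruptFromMs, long corruptUntilMs) {
        for (int i = 0; i < polls; i++) {
            long t = firstPollMs + i * intervalMs;
            if (t >= corruptFromMs && t < corruptUntilMs) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Milliseconds past 17:29:00, read from the log.
        long corruptFrom = 58_660;   // addToCorruptReplicasMap
        long corruptUntil = 59_347;  // invalidateBlock
        // Assumption: the first "Waiting for 1 corrupt replicas" was printed
        // right after the line stamped 17:29:58,638.
        long firstPoll = 58_638;

        // Polling every 1000 ms: samples at 58,638 and 59,638 both miss the window.
        System.out.println(pollObservesCorruption(firstPoll, 1000, 2, corruptFrom, corruptUntil)); // false

        // Polling every 100 ms: the sample at 58,738 lands inside the window.
        System.out.println(pollObservesCorruption(firstPoll, 100, 20, corruptFrom, corruptUntil)); // true
    }
}
```

Note that a shorter polling interval only narrows the race, it does not remove it; a check that does not depend on catching the transient corrupt state would be more robust.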



> Fix TestBlocksWithNotEnoughRacks
> --------------------------------
>
>                 Key: HDFS-5540
>                 URL: https://issues.apache.org/jira/browse/HDFS-5540
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Binglin Chang
>            Assignee: Binglin Chang
>
> TestBlocksWithNotEnoughRacks fails with a timeout while waiting for corrupt replicas:
> java.util.concurrent.TimeoutException: Timed out waiting for corrupt replicas. Waiting for 1, but only found 0
>       at org.apache.hadoop.hdfs.DFSTestUtil.waitCorruptReplicas(DFSTestUtil.java:351)
>       at org.apache.hadoop.hdfs.server.blockmanagement.TestBlocksWithNotEnoughRacks.testCorruptBlockRereplicatedAcrossRacks(TestBlocksWithNotEnoughRacks.java:219)



