Is this a bug in hadoop fsck or dfsadmin? Neither of them detected the missing replica data on the datanode.
2014-11-13 13:43 GMT+08:00 sam liu <[email protected]>:

> Hi Experts,
>
> In my HDFS, there is a file named /tmp/test.txt that consists of 1 block with
> 2 replicas. The block id is blk_1073742304_1480 and the 2 replicas reside on
> datanode1 and datanode2.
>
> Today I manually removed the block file on datanode2:
> ./current/BP-1640683473-9.181.64.230-1415757100604/current/finalized/subdir52/blk_1073742304.
> After that, I failed to read the HDFS file /tmp/test.txt from datanode2 and
> encountered an exception: "IOException: Got error for OP_READ_BLOCK...". That
> makes sense, as I had already removed one replica from datanode2.
>
> However, both 'hadoop fsck /tmp/test.txt -files -blocks -locations' and
> 'hadoop dfsadmin -report' say HDFS is healthy and no replica is missing.
> Even after waiting several minutes (I think the datanode sends heartbeats to
> the namenode to report its recent status), the fsck/dfsadmin tools still did
> not find the missing replica. Why?
>
> Thanks!
>
