Li Junjun created HDFS-4318:
-------------------------------
Summary: validateBlockMetadata reduces the success rate of block recovery
Key: HDFS-4318
URL: https://issues.apache.org/jira/browse/HDFS-4318
Project: Hadoop HDFS
Issue Type: Wish
Components: datanode
Affects Versions: 1.0.1
Reporter: Li Junjun
Priority: Minor
During block recovery, logs like the following appear:
"java.io.IOException: Block blk_3272028001529756059_11883841 length is 20480 does not match block file length 21376"
When the datanode performs block recovery, it calls validateBlockMetadata (in FSDataset.startBlockRecovery), which checks that the block file length matches the block's numBytes.
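
For reference, the failing check amounts to something like this (a simplified sketch, not the verbatim FSDataset source; the real method does more validation, and the class name here is illustrative):

{code:java}
import java.io.File;
import java.io.IOException;
import org.apache.hadoop.hdfs.protocol.Block;

class CurrentCheckSketch {
  // The block is rejected whenever the on-disk file length and the block's
  // numBytes differ in EITHER direction, even though a longer file is harmless.
  void validateBlockMetadata(Block b, File blockFile) throws IOException {
    long fileLen = blockFile.length();
    if (fileLen != b.getNumBytes()) {
      // This is the exception seen in the logs above.
      throw new IOException("Block " + b + " length is " + b.getNumBytes()
          + " does not match block file length " + fileLen);
    }
  }
}
{code}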
So let us look at how the block's numBytes is updated on the datanode. When writing a block in BlockReceiver.receivePacket, the order of operations is write -> flush -> setVisibleLength. That means it is normal and reasonable for the block file length to be greater than the block's numBytes when write or flush throws an exception, because setVisibleLength is never reached.
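
A minimal sketch of that ordering (field and method names besides setVisibleLength's role are illustrative, not the real 1.0.1 code):

{code:java}
import java.io.IOException;
import java.io.OutputStream;

class ReceiveSketch {
  private long visibleLength; // stands in for the block's numBytes

  void receivePacket(OutputStream out, byte[] pkt, int len) throws IOException {
    out.write(pkt, 0, len); // 1. bytes reach the block file on disk
    out.flush();            // 2. if write or flush throws, we stop here...
    visibleLength += len;   // 3. ...so numBytes is never advanced,
                            //    leaving file length > numBytes
  }
}
{code}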
In startBlockRecovery (and possibly other situations, still to be checked), we only need to guarantee that the block file length is never less than the block's numBytes.
I suggest changing validateBlockMetadata, because the strict equality check reduces the success rate of block recovery.
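
Concretely, the idea is to relax the equality check to a one-sided check, along these lines (a sketch of the proposal, not a tested patch; the class name is illustrative):

{code:java}
import java.io.File;
import java.io.IOException;
import org.apache.hadoop.hdfs.protocol.Block;

class ProposedCheckSketch {
  // Only the impossible case (file shorter than numBytes) should abort
  // recovery; a longer file is the expected leftover of a write or flush
  // that died before setVisibleLength ran.
  void validateBlockMetadata(Block b, File blockFile) throws IOException {
    long fileLen = blockFile.length();
    if (fileLen < b.getNumBytes()) {
      throw new IOException("Block " + b + " length is " + b.getNumBytes()
          + " but block file length is only " + fileLen);
    }
    // fileLen >= numBytes: accept; recovery can truncate back to numBytes.
  }
}
{code}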
Consider a pipeline a -> b -> c: if a hits a network error and b fails during write -> flush, the strict check disqualifies b's replica during recovery, and we can only count on c!