[
https://issues.apache.org/jira/browse/HDFS-4318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Li Junjun updated HDFS-4318:
----------------------------
Issue Type: Improvement (was: Wish)
> validateBlockMetadata reduces the success rate of block recovery
> ----------------------------------------------------------------
>
> Key: HDFS-4318
> URL: https://issues.apache.org/jira/browse/HDFS-4318
> Project: Hadoop HDFS
> Issue Type: Improvement
> Components: datanode
> Affects Versions: 1.0.1
> Reporter: Li Junjun
> Priority: Minor
>
> During block recovery, the datanode logs errors like:
> "java.io.IOException: Block blk_3272028001529756059_11883841 length is 20480 does not match block file length 21376"
> When the datanode performs startBlockRecovery, it calls validateBlockMetadata (in
> FSDataset.startBlockRecovery),
> which checks that the block file length exactly matches the block's numBytes.
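> A minimal sketch of that check (simplified; not the exact FSDataset source, and the class name is hypothetical):
> {code:java}
> import java.io.File;
> import java.io.IOException;
>
> class StrictValidateSketch {
>   // Sketch of the strict check described above: recovery fails on any
>   // mismatch between the on-disk block file length and numBytes.
>   static void validateBlockMetadata(File blockFile, long numBytes)
>       throws IOException {
>     long onDiskLen = blockFile.length();
>     if (onDiskLen != numBytes) {
>       throw new IOException("Block length is " + numBytes
>           + " does not match block file length " + onDiskLen);
>     }
>   }
> }
> {code}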
> So let us look at how the block's numBytes gets updated on the datanode.
> When writing a block in BlockReceiver.receivePacket, the order is
> write -> flush -> setVisibleLength. That means
> it is normal and expected for the file length to be greater than the block's numBytes if
> write or flush throws an exception.
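> A sketch of that ordering (the fields here are simplified stand-ins, not the real BlockReceiver members):
> {code:java}
> import java.io.IOException;
> import java.io.OutputStream;
>
> class ReceiveOrderSketch {
>   OutputStream out; // stand-in for the stream onto the block file
>   long numBytes;    // stand-in for the block's recorded length
>
>   // If write() or flush() throws after some bytes have reached the
>   // block file, numBytes is never advanced, leaving fileLength >
>   // numBytes on disk -- exactly the state the strict check rejects.
>   void receivePacket(byte[] buf, int off, int len, long offsetInBlock)
>       throws IOException {
>     out.write(buf, off, len); // 1. bytes land in the block file
>     out.flush();              // 2. and are pushed to disk
>     numBytes = offsetInBlock; // 3. only now is the length advanced
>   }
> }
> {code}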
> In startBlockRecovery (and possibly other situations, still to be checked),
> we only need to guarantee that file length < the block's numBytes never
> happens.
> I suggest changing validateBlockMetadata, because the strict equality check reduces the
> success rate of block recovery.
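> A sketch of the suggested relaxation (again simplified, hypothetical class name):
> {code:java}
> import java.io.File;
> import java.io.IOException;
>
> class RelaxedValidateSketch {
>   // Suggested check: only fail when the block file is shorter than
>   // numBytes; a longer file is the expected leftover of a failed
>   // write/flush and should not abort recovery.
>   static void validateBlockMetadata(File blockFile, long numBytes)
>       throws IOException {
>     long onDiskLen = blockFile.length();
>     if (onDiskLen < numBytes) {
>       throw new IOException("Block length is " + numBytes
>           + " but block file length is only " + onDiskLen);
>     }
>   }
> }
> {code}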
> With a pipeline a -> b -> c, if a hits a network error and b hits an error
> in write -> flush, we can only
> count on c!
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira