[ https://issues.apache.org/jira/browse/HDFS-4318?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Li Junjun updated HDFS-4318:
----------------------------

    Issue Type: Improvement  (was: Wish)
    
> validateBlockMetadata reduces the success rate of block recovery
> ----------------------------------------------------------------
>
>                 Key: HDFS-4318
>                 URL: https://issues.apache.org/jira/browse/HDFS-4318
>             Project: Hadoop HDFS
>          Issue Type: Improvement
>          Components: datanode
>    Affects Versions: 1.0.1
>            Reporter: Li Junjun
>            Priority: Minor
>
> When recovering a block, logs like the following appear:
> "java.io.IOException: Block blk_3272028001529756059_11883841
> length is 20480 does not match block file length 21376"
> When the datanode performs block recovery, it calls validateBlockMetadata
> (in FSDataset.startBlockRecovery), which checks that the block file length
> exactly matches the block's numBytes.
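> Roughly, the failing check looks like this (a sketch reconstructed from the
> log message above, not the exact Hadoop source; the method name and
> parameters here are illustrative):
>
>     // paraphrase of the strict check: numBytes is the recorded length,
>     // blockFile is the on-disk block file
>     static void validateStrict(java.io.File blockFile, long numBytes)
>         throws java.io.IOException {
>       long blockFileLen = blockFile.length();
>       if (numBytes != blockFileLen) {
>         throw new java.io.IOException("Block length is " + numBytes
>             + " does not match block file length " + blockFileLen);
>       }
>     }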
> So let us see how the block's numBytes is updated in the datanode.
> When writing a block in BlockReceiver.receivePacket, the order is
> write -> flush -> setVisibleLength. That means it is normal and
> reasonable for the file length to be greater than the block's numBytes
> when write or flush throws an exception.
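> Sketched in code (a paraphrase of the receive path, not the actual
> BlockReceiver source; names and signatures are approximate):
>
>     // inside BlockReceiver.receivePacket (paraphrased)
>     out.write(pktBuf, dataStart, len);   // the block file grows first
>     out.flush();
>     // the visible length (the block's numBytes) is advanced only after
>     // a successful flush; if write or flush throws, the block file is
>     // left longer than numBytes
>     datanode.data.setVisibleLength(block, offsetInBlock + len);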
> In startBlockRecovery (or other situations, to be checked), we just need to
> guarantee that file length < the block's numBytes never happens.
> I suggest changing validateBlockMetadata, because it reduces the success
> rate of block recovery.
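> A sketch of the relaxed check I have in mind (a paraphrase of the
> suggestion, not a patch; the method name is illustrative):
>
>     static void validateRelaxed(java.io.File blockFile, long numBytes)
>         throws java.io.IOException {
>       long blockFileLen = blockFile.length();
>       // reject only when the on-disk file is shorter than numBytes;
>       // a longer file is expected after a failed write or flush
>       if (blockFileLen < numBytes) {
>         throw new java.io.IOException("Block length is " + numBytes
>             + " but block file length is only " + blockFileLen);
>       }
>     }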
> When you have a pipeline a -> b -> c, and a hits a network error while b
> fails in write -> flush, we can only count on c!

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
