[
https://issues.apache.org/jira/browse/HDFS-29?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12764893#action_12764893
]
Tsz Wo (Nicholas), SZE commented on HDFS-29:
--------------------------------------------
After a series of HDFS-265 sub-task patches, getBlockMetaDataInfo(..) and
updateBlock(..) were replaced by initReplicaRecovery(..) and
updateReplicaUnderRecovery(..), respectively. The stop-writer logic was moved
from updateBlock(..) to initReplicaRecovery(..) so that the writer is stopped
before the replica length is read. So this problem disappears in the new code.
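A minimal, self-contained sketch of the new ordering (class, field, and
method names are illustrative only, not the actual HDFS-265 classes):
{code}
// Sketch only: why stopping the writer before reading the length makes
// the length stable. Names are illustrative, not the real HDFS code.
class ReplicaRecoverySketch {
  private Thread writer;     // hypothetical thread still appending to the replica
  private long bytesOnDisk;  // length of the replica's block file

  // New flow (initReplicaRecovery): stop the writer FIRST, then read the
  // length. The old flow read the length in getBlockMetaDataInfo(..) while
  // the writer could still append, and stopped it only in updateBlock(..).
  synchronized long initReplicaRecovery() throws InterruptedException {
    if (writer != null && writer.isAlive()) {
      writer.interrupt();  // ask the writer to stop
      writer.join();       // wait until it has actually stopped
    }
    return bytesOnDisk;    // stable: no one can change it anymore
  }
}
{code}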
I suggest that we add more validation to check the replica and its file
lengths; more specifically, to check (both checks are sketched below)
- the file length against the replica's visible length after stopping the writer, and
- the replica's original visible length obtained from initReplicaRecovery(..)
against its current visible length in updateReplicaUnderRecovery(..).
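A sketch of the two suggested checks (method and parameter names are
hypothetical):
{code}
// Sketch of the suggested validation; names are hypothetical.
class RecoveryValidationSketch {
  // Check 1: after the writer is stopped, the block file must hold at
  // least the replica's visible length, or data has been lost.
  static void checkAfterStopWriter(long fileLength, long visibleLength)
      throws java.io.IOException {
    if (fileLength < visibleLength) {
      throw new java.io.IOException("File length " + fileLength
          + " < visible length " + visibleLength);
    }
  }

  // Check 2: in updateReplicaUnderRecovery(..), the current visible
  // length must equal the original visible length returned earlier by
  // initReplicaRecovery(..); otherwise the replica changed in between.
  static void checkInUpdate(long originalVisibleLength, long currentVisibleLength)
      throws java.io.IOException {
    if (originalVisibleLength != currentVisibleLength) {
      throw new java.io.IOException("Visible length changed: expected "
          + originalVisibleLength + " but found " + currentVisibleLength);
    }
  }
}
{code}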
> In Datanode, update block may fail due to length inconsistency
> --------------------------------------------------------------
>
> Key: HDFS-29
> URL: https://issues.apache.org/jira/browse/HDFS-29
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: Tsz Wo (Nicholas), SZE
>
> When a primary datanode tries to recover a block, it calls
> getBlockMetaDataInfo(..) to obtain information, such as the block length,
> from each datanode. Then, it calls updateBlock(..).
> The block length returned by getBlockMetaDataInfo(..) may be obtained from an
> unclosed local block file F. However, updateBlock(..) first closes F
> (if F is open) and then gets the length. These two lengths may be different.
> In such a case, updateBlock(..) throws an exception.
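For illustration only, a tiny self-contained Java demo of the length
inconsistency described above: while a writer holds a file open with
buffered output, the observed file length can lag behind, and it changes
once the stream is closed and flushed.
{code}
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;

// Demo only, not HDFS code: the length of a file that is still open for
// buffered writing can differ from its length after the stream is closed.
public class LengthRaceDemo {
  public static void main(String[] args) throws IOException {
    File f = File.createTempFile("block", ".data");
    BufferedOutputStream out = new BufferedOutputStream(new FileOutputStream(f));
    out.write(new byte[100]);     // bytes sit in the user-space buffer
    long whileOpen = f.length();  // typically 0: nothing flushed yet
    out.close();                  // flushes the buffer, then closes
    long afterClose = f.length(); // 100: a different length
    System.out.println(whileOpen + " != " + afterClose);
    f.delete();
  }
}
{code}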