[
https://issues.apache.org/jira/browse/HDFS-29?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12764893#action_12764893
]
Tsz Wo (Nicholas), SZE commented on HDFS-29:
--------------------------------------------
After a series of HDFS-265 sub-task patches, getBlockMetaDataInfo(..) and
updateBlock(..) were replaced by initReplicaRecovery(..) and
updateReplicaUnderRecovery(..), respectively. The stop-writer logic was moved
from updateBlock(..) to initReplicaRecovery(..) so that the writer is stopped
before the replica length is read. So this problem disappears in the new code.
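A minimal, self-contained sketch of the new ordering (class, field, and
method names are illustrative only, not the actual HDFS-265 classes):
{code}
// Sketch only: why stopping the writer before reading the length makes
// the length stable. Names are illustrative, not the real HDFS code.
class ReplicaRecoverySketch {
  private Thread writer;     // hypothetical thread still appending to the replica
  private long bytesOnDisk;  // length of the replica's block file

  // New flow (initReplicaRecovery): stop the writer FIRST, then read the
  // length. The old flow read the length in getBlockMetaDataInfo(..) while
  // the writer could still append, and stopped it only in updateBlock(..).
  synchronized long initReplicaRecovery() throws InterruptedException {
    if (writer != null && writer.isAlive()) {
      writer.interrupt();  // ask the writer to stop
      writer.join();       // wait until it has actually stopped
    }
    return bytesOnDisk;    // stable: no one can change it anymore
  }
}
{code}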
I suggest that we add more validation to check the replica and its file
lengths; more specifically, to check (both checks are sketched below)
- the file length against the replica's visible length after stopping the writer, and
- the replica's original visible length obtained from initReplicaRecovery(..)
against its current visible length in updateReplicaUnderRecovery(..).
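A sketch of the two suggested checks (method and parameter names are
hypothetical):
{code}
// Sketch of the suggested validation; names are hypothetical.
class RecoveryValidationSketch {
  // Check 1: after the writer is stopped, the block file must hold at
  // least the replica's visible length, or data has been lost.
  static void checkAfterStopWriter(long fileLength, long visibleLength)
      throws java.io.IOException {
    if (fileLength < visibleLength) {
      throw new java.io.IOException("File length " + fileLength
          + " < visible length " + visibleLength);
    }
  }

  // Check 2: in updateReplicaUnderRecovery(..), the current visible
  // length must equal the original visible length returned earlier by
  // initReplicaRecovery(..); otherwise the replica changed in between.
  static void checkInUpdate(long originalVisibleLength, long currentVisibleLength)
      throws java.io.IOException {
    if (originalVisibleLength != currentVisibleLength) {
      throw new java.io.IOException("Visible length changed: expected "
          + originalVisibleLength + " but found " + currentVisibleLength);
    }
  }
}
{code}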
> In Datanode, update block may fail due to length inconsistency
> --------------------------------------------------------------
>
> Key: HDFS-29
> URL: https://issues.apache.org/jira/browse/HDFS-29
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: Tsz Wo (Nicholas), SZE
>
> When a primary datanode tries to recover a block, it calls
> getBlockMetaDataInfo(..) to obtain information, such as the block length,
> from each datanode. Then, it calls updateBlock(..).
> The block length returned by getBlockMetaDataInfo(..) may be obtained from an
> unclosed local block file F. However, updateBlock(..) first closes F
> (if F is open) and then gets the length. These two lengths may be different.
> In such a case, updateBlock(..) throws an exception.
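For illustration only, a tiny self-contained Java demo of the length
inconsistency described above: while a writer holds a file open with
buffered output, the observed file length can lag behind, and it changes
once the stream is closed and flushed.
{code}
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;

// Demo only, not HDFS code: the length of a file that is still open for
// buffered writing can differ from its length after the stream is closed.
public class LengthRaceDemo {
  public static void main(String[] args) throws IOException {
    File f = File.createTempFile("block", ".data");
    BufferedOutputStream out = new BufferedOutputStream(new FileOutputStream(f));
    out.write(new byte[100]);     // bytes sit in the user-space buffer
    long whileOpen = f.length();  // typically 0: nothing flushed yet
    out.close();                  // flushes the buffer, then closes
    long afterClose = f.length(); // 100: a different length
    System.out.println(whileOpen + " != " + afterClose);
    f.delete();
  }
}
{code}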