[ https://issues.apache.org/jira/browse/HDFS-125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12885079#action_12885079 ]
Steve Loughran commented on HDFS-125:
-------------------------------------
Is this risk still present?
I can imagine two ways of checking:
* checksum validator fetches the checksum from another DN instead of the one
whose block is being read.
* something gets the checksums from all DNs with the data, and looks for
inconsistencies.
Multiple replicas of a block with different declared checksums are going to be
a manual-intervention kind of problem. Still, it is better to know when it has
arisen.
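
A minimal sketch of the second idea (collect the declared checksums from every
DN holding the block and look for disagreement). This is illustrative only and
is not HDFS code: the names self_check, group_by_checksum and
replicas_consistent are hypothetical, the DN names and payloads are made up,
and a single CRC32 over the whole replica stands in for the checksums stored
in the block meta-file.

```python
import zlib
from collections import defaultdict
from typing import Dict, List


def self_check(data: bytes, stored_checksum: int) -> bool:
    """Per-replica check: the data matches the checksum stored beside it."""
    return zlib.crc32(data) == stored_checksum


def group_by_checksum(declared: Dict[str, int]) -> Dict[int, List[str]]:
    """Group DN names by the checksum each one declares for the block."""
    groups = defaultdict(list)
    for dn, checksum in declared.items():
        groups[checksum].append(dn)
    return {checksum: sorted(dns) for checksum, dns in groups.items()}


def replicas_consistent(declared: Dict[str, int]) -> bool:
    """True when every DN declares the same checksum for the block."""
    return len(set(declared.values())) <= 1


# Hypothetical replicas of one block: same size, but dn3 holds different bytes.
replicas = {
    "dn1": b"hello world",
    "dn2": b"hello world",
    "dn3": b"hello w0rld",
}

# Assumption: each DN's stored checksum was computed from its own copy,
# so every replica passes its local verification.
meta = {dn: zlib.crc32(data) for dn, data in replicas.items()}

every_replica_passes_locally = all(
    self_check(replicas[dn], meta[dn]) for dn in replicas
)
groups = group_by_checksum(meta)
```

Here every replica passes its own check, yet replicas_consistent(meta) is
false and groups separates dn1 and dn2 from dn3. The sketch only reports the
disagreement; deciding which group holds the correct data is left to the
operator.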
> Consistency of different replicas of the same block is not checked.
> -------------------------------------------------------------------
>
> Key: HDFS-125
> URL: https://issues.apache.org/jira/browse/HDFS-125
> Project: Hadoop HDFS
> Issue Type: Bug
> Reporter: Konstantin Shvachko
>
> HDFS currently detects corrupted replicas by verifying that each replica's
> contents match the checksum stored in the block meta-file. This is done
> independently for each replica of the block on the data-node it belongs to.
> But we do not check that the replicas are identical across data-nodes as long
> as they have the same size.
> This is not common but can happen as a result of a software bug or
> operator mismanagement. In this case, different clients will read
> different data from the same file.