[ https://issues.apache.org/jira/browse/HDFS-125?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12885079#action_12885079 ]

Steve Loughran commented on HDFS-125:
-------------------------------------

Is this risk still present? 

I could imagine two ways of checking:

* the checksum validator fetches the checksum from another DN instead of the one 
whose block is being read.
* something gets the checksums from all DNs holding the data and looks for 
inconsistencies. 
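The second approach could be sketched roughly like this -- note this is a toy illustration, not real HDFS code; fetch_checksum and the datanode list are hypothetical stand-ins:

```python
def find_inconsistent(block_id, datanodes, fetch_checksum):
    """Return each datanode's declared checksum if the replicas
    disagree, else None. fetch_checksum(dn, block_id) is a
    hypothetical call that reads the checksum a DN declares for
    its replica of the block."""
    seen = {dn: fetch_checksum(dn, block_id) for dn in datanodes}
    if len(set(seen.values())) > 1:
        return seen   # replicas disagree -> flag for manual intervention
    return None       # all replicas declare the same checksum

# toy usage with canned checksums standing in for real DN responses
checksums = {"dn1": "ab12", "dn2": "ab12", "dn3": "ff00"}
result = find_inconsistent("blk_42", list(checksums),
                           lambda dn, b: checksums[dn])
# result maps each datanode to its checksum, exposing dn3 as the outlier
```

As noted above, once two replicas genuinely declare different checksums the tool can only report the disagreement; it cannot decide which replica is authoritative on its own.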

Multiple replicas with different declared checksums is going to be a manual 
intervention kind of problem. Still, it is better to know when it has arisen.

> Consistency of different replicas of the same block is not checked.
> -------------------------------------------------------------------
>
>                 Key: HDFS-125
>                 URL: https://issues.apache.org/jira/browse/HDFS-125
>             Project: Hadoop HDFS
>          Issue Type: Bug
>            Reporter: Konstantin Shvachko
>
> HDFS currently detects corrupted replicas by verifying that their contents 
> match the checksum stored in the block meta-file. This is done 
> independently for each replica of the block on the data-node it belongs to. 
> But we do not check that the replicas are identical across data-nodes as long 
> as they have the same size.
> This is not common but can happen as a result of a software bug or 
> operator mismanagement. In that case different clients will read 
> different data from the same file.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
