[ https://issues.apache.org/jira/browse/HADOOP-1557?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12509949 ]
Doug Cutting commented on HADOOP-1557:
--------------------------------------

> when a setReplication() command is sent to the NameNode, no data blocks are
> being read

Right. But a setReplication() triggers replications, and, when those replications happen, the data is read. If, when writing the replica, the checksum of the received data does not match the checksum sent with that data, the receiving datanode should report to the namenode that the data was corrupt and abort the replication. This would cause the source block to be removed (provided there are more replicas) and the namenode to initiate new replications from a different source.

After HADOOP-1134, datanodes should always validate checksums as blocks are written. Whenever there's a mismatch, the write should be aborted. If the write is a replication (as opposed to an initial write), the mismatch should be reported to the namenode.

Does that sound like the right policy to you?

> Deletion of excess replicas should prefer to delete corrupted replicas before deleting valid replicas
> -----------------------------------------------------------------------------------------------------
>
>                 Key: HADOOP-1557
>                 URL: https://issues.apache.org/jira/browse/HADOOP-1557
>             Project: Hadoop
>          Issue Type: Bug
>          Components: dfs
>            Reporter: dhruba borthakur
>
> Suppose a block has three replicas and two of the replicas are corrupted. If
> the replication factor of the file is reduced to 2, the filesystem should
> preferably delete the two corrupted replicas; otherwise it could lead to a
> corrupted file.
> One option would be to make the datanode periodically validate all blocks
> against their corresponding CRCs. The other option would be to make the
> setReplication call validate existing replicas before deleting excess
> replicas.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
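
For illustration only, the write-time policy proposed in the comment above could be sketched roughly as follows. This is not Hadoop code: NameNodeStub, report_corrupt_source, receive_block and WriteKind are hypothetical names invented for this sketch, and a single zlib.crc32 over the whole block stands in for whatever checksum scheme the datanodes actually use.

```python
import zlib
from dataclasses import dataclass, field
from enum import Enum


class WriteKind(Enum):
    INITIAL = "initial"          # a client writing a new block
    REPLICATION = "replication"  # a copy made from an existing replica


@dataclass
class NameNodeStub:
    """Hypothetical stand-in for the namenode; it only records reports."""
    corrupt_reports: list = field(default_factory=list)

    def report_corrupt_source(self, block_id, source):
        # In the proposed policy the namenode would then drop the source
        # replica (if others exist) and re-replicate from a different one.
        self.corrupt_reports.append((block_id, source))


def receive_block(block_id, data, sent_crc, kind, source, namenode):
    """Return True if the block is accepted, False if the write is aborted.

    Assumption: one CRC32 covers the whole block (a simplification).
    """
    if zlib.crc32(data) == sent_crc:
        return True  # checksum matches: keep the new replica

    # Mismatch: always abort the write. Only a replication also reports
    # the mismatch to the namenode, since the source replica is suspect.
    if kind is WriteKind.REPLICATION:
        namenode.report_corrupt_source(block_id, source)
    return False
```

Under these assumptions, a mismatch during an initial write simply returns False, while a mismatch during a replication also leaves a (block_id, source) entry in corrupt_reports for the namenode to act on.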