[ https://issues.apache.org/jira/browse/HADOOP-6054?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Ravi Gummadi updated HADOOP-6054:
---------------------------------

    Attachment: d_verify.patch

Attaching a patch that makes distcp validate each file within the map task immediately after copying it. Validation compares file sizes and, if both file systems support checksums, file checksums. If validation fails, distcp retries the copy within the same map task; the maximum number of tries is configurable through distcp.file.retries (default: 3).

Please review and provide your comments.

> distcp should validate the data copied
> --------------------------------------
>
>                 Key: HADOOP-6054
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6054
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: tools/distcp
>    Affects Versions: 0.21.0
>            Reporter: Ravi Gummadi
>            Assignee: Ravi Gummadi
>             Fix For: 0.21.0
>
>         Attachments: d_verify.patch
>
>
> distcp should validate the files copied by checking the checksums, if the
> filesystem supports checksums.
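
For reviewers, the copy-validate-retry behavior described in the comment above can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code in d_verify.patch: the class CopyVerifier, its method names, and the checksumsSupported flag are hypothetical. Local files via java.nio and a CRC32 stand in for Hadoop file systems and their native checksums.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.zip.CRC32;

/** Hypothetical sketch of validate-after-copy with bounded retries. */
class CopyVerifier {

    /** CRC32 of a file's contents; a stand-in for a file system checksum. */
    static long checksum(Path file) throws IOException {
        CRC32 crc = new CRC32();
        byte[] buf = new byte[8192];
        try (InputStream in = Files.newInputStream(file)) {
            int n;
            while ((n = in.read(buf)) != -1) {
                crc.update(buf, 0, n);
            }
        }
        return crc.getValue();
    }

    /**
     * A copy is considered valid when the sizes match and, if checksums
     * are available on both sides (assumed flag), the checksums match too.
     */
    static boolean isValidCopy(Path src, Path dst, boolean checksumsSupported)
            throws IOException {
        if (Files.size(src) != Files.size(dst)) {
            return false;
        }
        return !checksumsSupported || checksum(src) == checksum(dst);
    }

    /**
     * Copies src to dst and validates the result, trying at most maxTries
     * times (the role played by distcp.file.retries in the comment above).
     * Returns true once a copy validates, false if every try fails.
     */
    static boolean copyWithValidation(Path src, Path dst, int maxTries,
                                      boolean checksumsSupported) throws IOException {
        for (int attempt = 1; attempt <= maxTries; attempt++) {
            Files.copy(src, dst, StandardCopyOption.REPLACE_EXISTING);
            if (isValidCopy(src, dst, checksumsSupported)) {
                return true;
            }
        }
        return false;
    }
}
```

A caller would invoke CopyVerifier.copyWithValidation(src, dst, 3, true) for each file and treat a false return as a failed copy. Checking sizes before checksums keeps the common mismatch case cheap, since a size comparison needs no read of the file contents.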