[
https://issues.apache.org/jira/browse/HADOOP-6054?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Ravi Gummadi updated HADOOP-6054:
---------------------------------
Attachment: d_verify.patch
Attaching a patch that makes distcp validate each file within the map task,
immediately after that file is copied. Validation compares file sizes and, if
both file systems support checksums, file checksums. If validation fails,
distcp retries the copy within the same map task. The maximum number of tries
is configurable through distcp.file.retries (default value: 3).
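For reviewers who want the control flow at a glance, here is a minimal sketch of the copy, validate, retry loop. It is not the code in d_verify.patch. It uses plain java.nio in place of Hadoop's FileSystem API, computes a CRC32 as a stand-in for a filesystem-provided checksum (so it always compares checksums rather than first checking for support), and the names VerifiedCopy, checksum, matches, copyWithVerify and DEFAULT_MAX_TRIES are all hypothetical.

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.zip.CRC32;

final class VerifiedCopy {
    // Hypothetical constant mirroring the default of distcp.file.retries.
    static final int DEFAULT_MAX_TRIES = 3;

    // CRC32 over the file contents; an assumed stand-in for a checksum
    // that a real file system would supply.
    static long checksum(Path file) throws IOException {
        CRC32 crc = new CRC32();
        byte[] buf = new byte[8192];
        try (InputStream in = Files.newInputStream(file)) {
            int n;
            while ((n = in.read(buf)) != -1) {
                crc.update(buf, 0, n);
            }
        }
        return crc.getValue();
    }

    // Cheap size comparison first, then the checksum comparison.
    static boolean matches(Path src, Path dst) throws IOException {
        if (Files.size(src) != Files.size(dst)) {
            return false;
        }
        return checksum(src) == checksum(dst);
    }

    // Copies src to dst and validates the result, retrying up to maxTries
    // times. Returns the number of attempts used; throws if none validated.
    static int copyWithVerify(Path src, Path dst, int maxTries) throws IOException {
        for (int attempt = 1; attempt <= maxTries; attempt++) {
            Files.copy(src, dst, StandardCopyOption.REPLACE_EXISTING);
            if (matches(src, dst)) {
                return attempt;
            }
        }
        throw new IOException("Copy of " + src + " failed validation after "
                + maxTries + " tries");
    }
}
```

A call such as VerifiedCopy.copyWithVerify(src, dst, VerifiedCopy.DEFAULT_MAX_TRIES) either returns the attempt on which validation succeeded or throws once the tries are exhausted, which roughly corresponds to the per-file behaviour described above.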
Please review and provide your comments.
> distcp should validate the data copied
> --------------------------------------
>
> Key: HADOOP-6054
> URL: https://issues.apache.org/jira/browse/HADOOP-6054
> Project: Hadoop Core
> Issue Type: New Feature
> Components: tools/distcp
> Affects Versions: 0.21.0
> Reporter: Ravi Gummadi
> Assignee: Ravi Gummadi
> Fix For: 0.21.0
>
> Attachments: d_verify.patch
>
>
> distcp should validate the files copied by checking the checksums, if the
> filesystem supports checksums.