[ https://issues.apache.org/jira/browse/HADOOP-6054?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Ravi Gummadi updated HADOOP-6054:
---------------------------------

    Attachment: d_verify.patch

Attaching a patch that makes distcp validate each copied file within the map 
task, immediately after that file is copied. Validation compares file sizes 
and, if both file systems support checksums, the file checksums as well. If 
validation fails, distcp retries the copy within the same map task. The 
maximum number of tries is configurable via distcp.file.retries (default 
value: 3).
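
For reviewers who want the flow at a glance, here is a minimal sketch of the 
copy, validate, and retry loop described above. It is not the code in 
d_verify.patch: it uses plain java.nio on local files, with a CRC32 standing 
in for the file system checksum, and the names CopyVerifySketch, checksum, 
isValidCopy, and copyWithValidation are hypothetical. In Hadoop, the 
corresponding values would presumably come from FileStatus.getLen() and 
FileSystem.getFileChecksum().

```java
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.zip.CRC32;

// Illustrative only: local-file stand-in for the validation described above.
public class CopyVerifySketch {

    // CRC32 of the file contents; an assumed substitute for a file system checksum.
    static long checksum(Path file) throws IOException {
        CRC32 crc = new CRC32();
        byte[] buf = new byte[8192];
        try (InputStream in = Files.newInputStream(file)) {
            int n;
            while ((n = in.read(buf)) != -1) {
                crc.update(buf, 0, n);
            }
        }
        return crc.getValue();
    }

    // Sizes must always match; checksums are compared only when both sides support them.
    static boolean isValidCopy(Path src, Path dst, boolean checksumsSupported)
            throws IOException {
        if (Files.size(src) != Files.size(dst)) {
            return false;
        }
        if (!checksumsSupported) {
            return true;
        }
        return checksum(src) == checksum(dst);
    }

    // Copies src to dst, validating after each attempt. Returns the number of
    // attempts used, or throws once maxTries attempts have all failed validation.
    static int copyWithValidation(Path src, Path dst, int maxTries,
                                  boolean checksumsSupported) throws IOException {
        for (int attempt = 1; attempt <= maxTries; attempt++) {
            Files.copy(src, dst, StandardCopyOption.REPLACE_EXISTING);
            if (isValidCopy(src, dst, checksumsSupported)) {
                return attempt;
            }
        }
        throw new IOException("Copy of " + src + " failed validation after "
                + maxTries + " tries");
    }
}
```

In this sketch, maxTries plays the role of distcp.file.retries, so a caller 
would pass 3 to mirror the default.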

Please review and provide your comments.

> distcp should validate the data copied
> --------------------------------------
>
>                 Key: HADOOP-6054
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6054
>             Project: Hadoop Core
>          Issue Type: New Feature
>          Components: tools/distcp
>    Affects Versions: 0.21.0
>            Reporter: Ravi Gummadi
>            Assignee: Ravi Gummadi
>             Fix For: 0.21.0
>
>         Attachments: d_verify.patch
>
>
> distcp should validate the files copied by checking the checksums, if the 
> filesystem supports checksums.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
