[
https://issues.apache.org/jira/browse/MAPREDUCE-1231?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12782585#action_12782585
]
Tsz Wo (Nicholas), SZE commented on MAPREDUCE-1231:
---------------------------------------------------
> ... Checking only on the length feels risky to me. ...
Yes, that is why -skipcrccheck is an option. Checking mtime is definitely a good
idea, but it should probably be done in a separate issue.
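
To make the trade-off concrete, here is a minimal sketch of the skip decision
being discussed. It is not the actual DistCp code: the class CopyDecision, the
FileInfo holder, and the needsCopy method are hypothetical names, and treating
a missing checksum as "copy anyway" is an assumption made for illustration.

```java
import java.util.Arrays;

public class CopyDecision {
    /** Hypothetical stand-in for file metadata; not a Hadoop class. */
    static final class FileInfo {
        final long length;
        final byte[] checksum; // null when no checksum is available

        FileInfo(long length, byte[] checksum) {
            this.length = length;
            this.checksum = checksum;
        }
    }

    /**
     * Returns true if the source should be copied over the destination.
     * With skipCrcCheck set, only the lengths are compared, which is
     * cheap but cannot detect same-length files with different content.
     */
    static boolean needsCopy(FileInfo src, FileInfo dst, boolean skipCrcCheck) {
        if (dst == null) {
            return true; // nothing at the destination yet
        }
        if (src.length != dst.length) {
            return true; // a length mismatch always forces a copy
        }
        if (skipCrcCheck) {
            return false; // lengths match and the caller accepted the risk
        }
        if (src.checksum == null || dst.checksum == null) {
            return true; // assumption: be conservative when unverifiable
        }
        return !Arrays.equals(src.checksum, dst.checksum);
    }

    public static void main(String[] args) {
        FileInfo src = new FileInfo(10, new byte[] {1, 2, 3});
        FileInfo dst = new FileInfo(10, new byte[] {9, 9, 9});
        System.out.println(needsCopy(src, dst, false)); // prints "true"
        System.out.println(needsCopy(src, dst, true));  // prints "false"
    }
}
```

The second call shows the risk raised above: with the checksum comparison
skipped, two same-length files with different contents are treated as
identical.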
> Distcp is very slow
> -------------------
>
> Key: MAPREDUCE-1231
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-1231
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Components: distcp
> Affects Versions: 0.20.1
> Reporter: Jothi Padmanabhan
> Assignee: Jothi Padmanabhan
> Attachments: mapred-1231-v1.patch, mapred-1231-v2.patch,
> mapred-1231-y20-v2.patch, mapred-1231-y20.patch, mapred-1231.patch
>
>
> Currently distcp performs a checksum check in addition to a file length check
> to decide whether a remote file has to be copied. If the number of files is
> high (thousands), this checksum check proves fairly costly, leading to a long
> delay before the copy starts.