[ https://issues.apache.org/jira/browse/HADOOP-15300?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17113163#comment-17113163 ]
Steve Loughran commented on HADOOP-15300:
-----------------------------------------
duplicate of HADOOP-16756; fixed by rollback of HADOOP-8143
> distcp -update to WASB and ADL copies up all the files, always
> --------------------------------------------------------------
>
> Key: HADOOP-15300
> URL: https://issues.apache.org/jira/browse/HADOOP-15300
> Project: Hadoop Common
> Issue Type: Bug
> Components: fs/adl, fs/azure
> Affects Versions: 3.1.0
> Reporter: Steve Loughran
> Priority: Major
>
> If you run {{distcp -update}} repeatedly against an adl or wasb store, all the
> source files are copied up every time. In contrast, with hdfs:// or s3a:// as
> the destination, only the new files are uploaded. hdfs uses checksums for the
> diff; s3a just returns the file length and relies on the distcp logic of
> "if either src or dest doesn't do checksums, only compare file len".
> Somehow that fallback is not kicking in for wasb and adl. Tested with file:
> and hdfs sources and with wasb and adl destinations.
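
For readers unfamiliar with the fallback quoted in the description, here is a minimal sketch of the intended decision, under stated assumptions. It is not the actual DistCp code: FileInfo and UpdateSkipSketch are hypothetical names invented for illustration, and a null checksum stands in for "this store does not do checksums".

```java
// Hypothetical stand-in for a file's status; this is not a Hadoop class.
final class FileInfo {
    final long length;
    final String checksum; // null when the store exposes no checksum (assumption)

    FileInfo(long length, String checksum) {
        this.length = length;
        this.checksum = checksum;
    }
}

// Illustrative sketch of an -update style "can this copy be skipped?" check.
final class UpdateSkipSketch {
    static boolean canSkip(FileInfo src, FileInfo dst) {
        if (dst == null) {
            return false; // nothing at the destination yet: must copy
        }
        if (src.length != dst.length) {
            return false; // sizes differ: must copy
        }
        if (src.checksum == null || dst.checksum == null) {
            return true; // either side lacks a checksum: a length match is enough
        }
        return src.checksum.equals(dst.checksum); // both sides have checksums
    }

    public static void main(String[] args) {
        FileInfo src = new FileInfo(1024, "abc");
        // Same length, destination has no checksum: skip the copy.
        System.out.println(canSkip(src, new FileInfo(1024, null)));
        // Different length: copy again.
        System.out.println(canSkip(src, new FileInfo(2048, null)));
    }
}
```

Under this sketch, a destination that reports no checksum and an unchanged length would be skipped on a repeat run, which is the behaviour the description expects from wasb and adl.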