Steve Loughran created HADOOP-15300:
---------------------------------------
Summary: distcp -update to WASB and ADL copies up all the files, always
Key: HADOOP-15300
URL: https://issues.apache.org/jira/browse/HADOOP-15300
Project: Hadoop Common
Issue Type: Bug
Components: fs/adl, fs/azure
Affects Versions: 3.1.0
Reporter: Steve Loughran

If you repeatedly run {{distcp -update}} against an adl or wasb store, all the source files are copied up every time. In contrast, with hdfs:// or s3a:// as the destination, only the new files are uploaded.

hdfs uses checksums for the diff, whereas s3a just returns the file length and relies on the distcp logic being "if either src or dest doesn't do checksums, only compare file len". Somehow that's not kicking in for wasb and adl.

Tested with file: and hdfs sources, and wasb and adl destinations.
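
For illustration, the comparison rule quoted in the report ("if either src or dest doesn't do checksums, only compare file len") can be modelled as below. This is a minimal sketch of the expected behaviour, not DistCp's actual code: `FileInfo` and `needs_copy` are hypothetical names, and a checksum of `None` is an assumed stand-in for a store that does not provide checksums.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FileInfo:
    """Hypothetical summary of one file on a store."""
    length: int
    checksum: Optional[str] = None  # None: the store offers no checksum (assumption)


def needs_copy(src: FileInfo, dst: Optional[FileInfo]) -> bool:
    """Return True if the source file should be uploaded on an update run."""
    if dst is None:
        # Nothing at the destination yet, so the file is new.
        return True
    if src.length != dst.length:
        # Different lengths always mean different content.
        return True
    if src.checksum is None or dst.checksum is None:
        # Either side lacks checksums: fall back to the length comparison,
        # which has already matched, so the copy is skipped.
        return False
    # Both sides provide checksums: copy only when they differ.
    return src.checksum != dst.checksum
```

Under this model, a second update run to a destination without checksums should skip every file whose length is unchanged, which is the behaviour the report says is missing for wasb and adl.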