Steve Loughran created HADOOP-15300:
---------------------------------------
Summary: distcp -update to WASB and ADL copies up all the files, always
Key: HADOOP-15300
URL: https://issues.apache.org/jira/browse/HADOOP-15300
Project: Hadoop Common
Issue Type: Bug
Components: fs/adl, fs/azure
Affects Versions: 3.1.0
Reporter: Steve Loughran
If you run {{distcp -update}} against an adl or wasb store repeatedly, every
source file is copied up every time. In contrast, with hdfs:// or s3a:// as
the destination, only the new files are uploaded. hdfs uses checksums for the
diff; s3a just returns the file length and relies on the distcp logic of
"if either src or dest doesn't do checksums, only compare file len".
Somehow that isn't kicking in for wasb and adl. Tested with file: and hdfs
sources, and with wasb and adl destinations.
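
For reference, the skip rule quoted above can be sketched roughly as follows.
This is only a model of the behaviour as described in this report, not the
actual DistCp code: the {{UpdateSkipSketch}} and {{FileInfo}} names are
hypothetical, and a missing checksum is represented by an empty Optional.

```java
import java.util.Optional;

// Hypothetical model of the "-update" skip decision described in this issue.
// None of these types are Hadoop classes.
final class UpdateSkipSketch {

    /** Minimal stand-in for a file's status on one side of the copy. */
    static final class FileInfo {
        final long length;
        final Optional<String> checksum; // empty when the store exposes no checksum

        FileInfo(long length, String checksum) {
            this.length = length;
            this.checksum = Optional.ofNullable(checksum);
        }
    }

    /**
     * Returns true when the copy of src can be skipped because dst already
     * looks identical. A null dst means the destination file does not exist.
     */
    static boolean canSkip(FileInfo src, FileInfo dst) {
        if (dst == null) {
            return false; // nothing at the destination: must copy
        }
        if (src.length != dst.length) {
            return false; // lengths differ: must copy
        }
        if (!src.checksum.isPresent() || !dst.checksum.isPresent()) {
            return true; // either side lacks checksums: length match is enough
        }
        // Both sides have checksums: compare them.
        return src.checksum.get().equals(dst.checksum.get());
    }
}
```

Under this model, a destination that reports no checksum and a matching length
should be skipped, which is what s3a gets and what wasb and adl apparently do
not.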