Steve Loughran created HADOOP-15300:
---------------------------------------

             Summary: distcp -update to WASB and ADL copies up all the files, always
                 Key: HADOOP-15300
                 URL: https://issues.apache.org/jira/browse/HADOOP-15300
             Project: Hadoop Common
          Issue Type: Bug
          Components: fs/adl, fs/azure
    Affects Versions: 3.1.0
            Reporter: Steve Loughran


If you repeatedly use {{distcp -update}} with an adl or wasb store as the 
destination, all the source files are copied up every time. In contrast, if 
you use hdfs:// or s3a:// as the destination, only the new files are uploaded. 
hdfs uses checksums for the diff; s3a returns only the file length and relies 
on the distcp logic being "if either src or dest doesn't do checksums, only 
compare file len".

Somehow that's not kicking in for wasb and adl. Tested with file: and hdfs 
sources, and wasb and adl destinations.
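
For reference, the skip rule quoted above can be sketched as follows. This is a minimal, self-contained illustration of the behaviour the report expects, not the actual DistCp code: the class {{UpdateSkipSketch}}, the {{FileInfo}} holder and the {{canSkip}} method are hypothetical names, and a {{null}} checksum is an assumed way to model a store that does not return checksums.

```java
import java.util.Arrays;

/** Hypothetical sketch of the expected -update skip rule; not DistCp's real classes. */
final class UpdateSkipSketch {

    /** Minimal stand-in for a file status: a length plus an optional checksum. */
    static final class FileInfo {
        final long length;
        final byte[] checksum; // null models a store that returns no checksum

        FileInfo(long length, byte[] checksum) {
            this.length = length;
            this.checksum = checksum;
        }
    }

    /** Returns true when the destination copy can be kept and the upload skipped. */
    static boolean canSkip(FileInfo source, FileInfo target) {
        if (target == null) {
            return false; // nothing at the destination yet, so the file must be copied
        }
        if (source.length != target.length) {
            return false; // different lengths always force a copy
        }
        if (source.checksum == null || target.checksum == null) {
            return true; // either side lacks checksums: compare file length only
        }
        return Arrays.equals(source.checksum, target.checksum);
    }

    public static void main(String[] args) {
        FileInfo src = new FileInfo(1024L, new byte[] {1, 2, 3});
        FileInfo sameLengthNoChecksum = new FileInfo(1024L, null);
        FileInfo differentLength = new FileInfo(2048L, null);

        System.out.println(canSkip(src, sameLengthNoChecksum)); // true: length-only comparison
        System.out.println(canSkip(src, differentLength));      // false: lengths differ
        System.out.println(canSkip(src, null));                 // false: nothing at destination
    }
}
```

Under this rule, a second {{distcp -update}} run against a store with no checksums would skip every file whose length is unchanged, which is the behaviour seen with s3a and the one wasb and adl are not showing.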



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
