[ https://issues.apache.org/jira/browse/HADOOP-6051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12721344#action_12721344 ]
Ravi Gummadi commented on HADOOP-6051:
--------------------------------------

Currently, -update writes to bar only, and I think that is correct. It copies to bar/foo only if bar exists and is a directory (the same as what happens without -update). If "bar" doesn't exist at the destination, then foo is copied to bar. If "bar" exists at the destination and is a file, it is overwritten if it differs from the source (this is the case where the file is being overwritten again and again, though it should not be).

I don't see any path difference between running with -update and running without it in any case (whether the destination exists or not). Am I missing a case where -update writes to a different path than a run without the -update option?

> distcp does not skip copying file if we are updating single file
> ----------------------------------------------------------------
>
>                 Key: HADOOP-6051
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6051
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: tools/distcp
>    Affects Versions: 0.21.0
>            Reporter: Ravi Gummadi
>             Fix For: 0.21.0
>
>
> distcp doesn't skip copying file when we do -update on single file if the
> destfile already exists.
> When we do
> hadoop distcp -update srcfilename destfilename
> it seems to be comparing checksums of srcfilename and
> destfilename/srcfilename and so skip is not done. It should compare checksums
> of srcfilename and destfilename.
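
To make the path rule and the intended skip check concrete, here is a minimal sketch in Python against the local filesystem. It is not distcp's code: the function names (resolve_target, checksum, update_copy) are hypothetical, SHA-256 stands in for whatever file checksum distcp compares, and the only behaviour modelled is what is described above (write to bar/foo only when bar is an existing directory, and compare the source against the path that will actually be written).

```python
import hashlib
import os
import shutil


def resolve_target(src: str, dst: str) -> str:
    """Return the path a single-file copy of src to dst would write to.

    Rule from the comment: dst/<name of src> only when dst is an existing
    directory; otherwise dst itself (whether it is missing or a file).
    """
    if os.path.isdir(dst):
        return os.path.join(dst, os.path.basename(src))
    return dst


def checksum(path: str) -> str:
    """Hex digest of a file's contents (assumed stand-in for a file checksum)."""
    with open(path, "rb") as f:
        return hashlib.sha256(f.read()).hexdigest()


def update_copy(src: str, dst: str) -> bool:
    """Copy src with -update-like semantics; return True if a copy happened."""
    target = resolve_target(src, dst)
    # Compare against the path that will actually be written. Comparing
    # against dst/<name of src> when dst is a plain file would never match,
    # which is the repeated overwrite described in the report.
    if os.path.isfile(target) and checksum(target) == checksum(src):
        return False  # identical content already present: skip
    shutil.copyfile(src, target)
    return True
```

With this rule, a second update_copy(src, dst) call on an unchanged source returns False, because the comparison is made against the same path the first call wrote.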