[ 
https://issues.apache.org/jira/browse/HADOOP-6051?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12721060#action_12721060
 ] 

Ravi Gummadi commented on HADOOP-6051:
--------------------------------------

Earlier, only file sizes were checked. In trunk, checksums are now also 
checked after the file sizes.
In any case, if I run the following command multiple times

hadoop distcp -update srcfile destfile

and if destfile doesn't exist, -update should copy the file only once; from 
the second run onwards it should skip the copy, because the file sizes (and 
checksums) are the same.
But the problem seems to be that distcp is not comparing the file sizes and 
checksums of srcfile and destfile. It seems to be comparing srcfile with 
the path destfile/srcfile (i.e. srcfile inside a destfile directory), which 
is wrong.
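
To make the difference concrete, here is a toy Python model of the two 
behaviours. It is only a sketch and not distcp's actual code: the dict 
standing in for the filesystem, the FileInfo record, and the function names 
(buggy_target, expected_target, should_skip, run_update) are all 
hypothetical, and it assumes a single-file copy always lands at destfile.

```python
import posixpath
from dataclasses import dataclass
from typing import Callable, Dict, Optional


@dataclass(frozen=True)
class FileInfo:
    """Hypothetical stand-in for a file's size and checksum."""
    size: int
    checksum: str


def buggy_target(src: str, dst: str) -> str:
    # Behaviour described in this report: the comparison is made against
    # dst/<name of src>, as if dst were a directory.
    return posixpath.join(dst, posixpath.basename(src))


def expected_target(src: str, dst: str) -> str:
    # Expected behaviour for a single file: compare against dst itself.
    return dst


def should_skip(src_info: FileInfo, dst_info: Optional[FileInfo]) -> bool:
    # Skip only when the target exists and both size and checksum match.
    if dst_info is None:
        return False
    return (src_info.size == dst_info.size
            and src_info.checksum == dst_info.checksum)


def run_update(fs: Dict[str, FileInfo], src: str, dst: str,
               resolve: Callable[[str, str], str]) -> bool:
    """Simulate one '-update' run; return True if a copy was performed."""
    compare_path = resolve(src, dst)
    if should_skip(fs[src], fs.get(compare_path)):
        return False
    fs[dst] = fs[src]  # assumption: the single file is written at dst
    return True


# Two consecutive runs with each comparison rule.
good_fs = {"/a/srcfile": FileInfo(10, "abc")}
good_runs = [run_update(good_fs, "/a/srcfile", "/b/destfile", expected_target)
             for _ in range(2)]

bad_fs = {"/a/srcfile": FileInfo(10, "abc")}
bad_runs = [run_update(bad_fs, "/a/srcfile", "/b/destfile", buggy_target)
            for _ in range(2)]
```

In this model, comparing against destfile copies on the first run and skips 
on the second, while comparing against destfile/srcfile never finds a match 
and copies on every run.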

> distcp does not skip copying file if we are updating single file
> ----------------------------------------------------------------
>
>                 Key: HADOOP-6051
>                 URL: https://issues.apache.org/jira/browse/HADOOP-6051
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: tools/distcp
>    Affects Versions: 0.21.0
>            Reporter: Ravi Gummadi
>             Fix For: 0.21.0
>
>
> distcp doesn't skip copying the file when we run -update on a single file 
> and destfile already exists.
> When we run 
> hadoop distcp -update srcfilename destfilename
> it seems to compare the checksums of srcfilename and 
> destfilename/srcfilename, so the copy is not skipped. It should compare the 
> checksums of srcfilename and destfilename.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
