[ 
https://issues.apache.org/jira/browse/HADOOP-16083?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16761161#comment-16761161
 ] 

Steve Loughran commented on HADOOP-16083:
-----------------------------------------

I understand. It's the special case of a single-file copy. It probably hasn't 
surfaced much because either (a) it doesn't get used much or (b) it's only been 
used for small files and nobody noticed. And yes, it looks like a bug to me.

I think for the single-file copy we just need to make sure that:

* in a filesystem with checksums, the copy doesn't take place if the checksums 
  match
* it does still take place if the checksums don't match, or if -skipcrccheck 
  is set
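
The two rules above can be sketched as a small decision function. This is only 
an illustration, not the DistCp implementation: the class and method names are 
hypothetical, checksums are modelled as plain strings, and treating a missing 
checksum as "copy anyway" is an assumption that the comment does not state.

```java
// Hypothetical sketch of the single-file copy decision; not DistCp code.
class SingleFileCopyDecision {

    /**
     * Decide whether a single source file should be copied over the target.
     * A null checksum stands for "this filesystem exposes no checksum".
     */
    static boolean shouldCopy(boolean targetExists,
                              boolean overwrite,
                              boolean skipCrcCheck,
                              String sourceChecksum,
                              String targetChecksum) {
        // Nothing to compare against, or the user explicitly asked to overwrite.
        if (!targetExists || overwrite) {
            return true;
        }
        // Rule from the comment: with the skip-CRC-check option set, still copy.
        if (skipCrcCheck) {
            return true;
        }
        // Assumption: without checksums we cannot prove the files match, so copy.
        if (sourceChecksum == null || targetChecksum == null) {
            return true;
        }
        // With checksums available, copy only when they differ.
        return !sourceChecksum.equals(targetChecksum);
    }

    public static void main(String[] args) {
        System.out.println(shouldCopy(true, false, false, "abc", "abc")); // false
        System.out.println(shouldCopy(true, false, false, "abc", "xyz")); // true
        System.out.println(shouldCopy(true, false, true, "abc", "abc"));  // true
    }
}
```

The explicit overwrite flag is checked first, so a user who passes -overwrite 
always gets the copy, while a plain single-file copy is skipped when the 
checksums agree.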


> DistCp shouldn't always overwrite the target file when checksums match
> ----------------------------------------------------------------------
>
>                 Key: HADOOP-16083
>                 URL: https://issues.apache.org/jira/browse/HADOOP-16083
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: tools/distcp
>    Affects Versions: 3.2.0, 3.1.1, 3.3.0
>            Reporter: Siyao Meng
>            Assignee: Siyao Meng
>            Priority: Major
>         Attachments: HADOOP-16083.001.patch
>
>
> {code:java|title=CopyMapper#setup}
> ...
>     try {
>       overWrite = overWrite || targetFS.getFileStatus(targetFinalPath).isFile();
>     } catch (FileNotFoundException ignored) {
>     }
> ...
> {code}
> The above code overrides the config key "overWrite" to "true" when the target 
> path is a file. Therefore, an unnecessary transfer happens even when the 
> source and target files have the same checksums.
> My suggestion is to remove the code above. If the user insists on 
> overwriting, they can just add -overwrite to the options:
> {code:bash|title=DistCp command with -overwrite option}
> hadoop distcp -overwrite hdfs://localhost:64464/source/5/6.txt \
>   hdfs://localhost:64464/target/5/6.txt
> {code}



