[ https://issues.apache.org/jira/browse/HADOOP-16083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Wei-Chiu Chuang updated HADOOP-16083:
-------------------------------------
===Bulk update===
I am planning to cut the branch for the Hadoop 3.3.1 release, and this Jira
currently targets 3.3.1. Please take the time to review the patch, or move the
issue out of 3.3.1 if you think it can't be finished in the next few weeks.
> DistCp shouldn't always overwrite the target file when checksums match
> ----------------------------------------------------------------------
>
> Key: HADOOP-16083
> URL: https://issues.apache.org/jira/browse/HADOOP-16083
> Project: Hadoop Common
> Issue Type: Improvement
> Components: tools/distcp
> Affects Versions: 3.2.0, 3.1.1, 3.3.0
> Reporter: Siyao Meng
> Assignee: Siyao Meng
> Priority: Major
> Attachments: HADOOP-16083.001.patch
>
>
> {code:java|title=CopyMapper#setup}
> ...
> try {
>   overWrite = overWrite ||
>       targetFS.getFileStatus(targetFinalPath).isFile();
> } catch (FileNotFoundException ignored) {
> }
> ...
> {code}
> The above code forces the "overWrite" setting to "true" whenever the target
> path is an existing file. As a result, the file is transferred again even
> when the source and target files have the same checksum.
> My suggestion is to remove the code above. If the user insists on
> overwriting, they can add -overwrite to the options:
> {code:bash|title=DistCp command with -overwrite option}
> hadoop distcp -overwrite hdfs://localhost:64464/source/5/6.txt \
>   hdfs://localhost:64464/target/5/6.txt
> {code}
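To make the effect of the quoted snippet concrete, here is a minimal, self-contained sketch in plain Java. It is not the actual CopyMapper code: the class OverwriteDecisionSketch and the methods effectiveOverwriteCurrent, effectiveOverwriteProposed and shouldCopy are hypothetical names, the checksums are stand-in byte arrays, and the copy decision is a simplified model assumed for illustration only.

```java
import java.util.Arrays;

public class OverwriteDecisionSketch {

    // Models the quoted setup() snippet: the flag is forced to true whenever
    // the target path is an existing file. A missing target
    // (FileNotFoundException in the original) is modelled as targetIsFile == false.
    static boolean effectiveOverwriteCurrent(boolean overwriteOption, boolean targetIsFile) {
        return overwriteOption || targetIsFile;
    }

    // Models the proposal: with the snippet removed, only the user's
    // -overwrite option decides the flag.
    static boolean effectiveOverwriteProposed(boolean overwriteOption, boolean targetIsFile) {
        return overwriteOption;
    }

    // Hypothetical, simplified copy decision (an assumption, not DistCp's real logic):
    // copy when overwrite is set or when the checksums differ.
    static boolean shouldCopy(boolean overwrite, byte[] sourceChecksum, byte[] targetChecksum) {
        return overwrite || !Arrays.equals(sourceChecksum, targetChecksum);
    }

    public static void main(String[] args) {
        byte[] source = {1, 2, 3};
        byte[] target = {1, 2, 3}; // same checksum as the source

        // Target is an existing file and the user did not pass -overwrite.
        boolean current = effectiveOverwriteCurrent(false, true);
        boolean proposed = effectiveOverwriteProposed(false, true);

        System.out.println("copy with current behaviour:  " + shouldCopy(current, source, target));
        System.out.println("copy with proposed behaviour: " + shouldCopy(proposed, source, target));
    }
}
```

Under these assumptions, the current behaviour copies the file even though the checksums match, while the proposed behaviour skips the copy unless -overwrite is passed explicitly.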
--
This message was sent by Atlassian Jira
(v8.3.4#803005)