[jira] [Updated] (HADOOP-16083) DistCp shouldn't always overwrite the target file when checksums match
[ https://issues.apache.org/jira/browse/HADOOP-16083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Siyao Meng updated HADOOP-16083:
    Target Version/s:   (was: 3.3.1)

> DistCp shouldn't always overwrite the target file when checksums match
> ----------------------------------------------------------------------
>
>                 Key: HADOOP-16083
>                 URL: https://issues.apache.org/jira/browse/HADOOP-16083
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: tools/distcp
>    Affects Versions: 3.2.0, 3.1.1, 3.3.0
>            Reporter: Siyao Meng
>            Assignee: Siyao Meng
>            Priority: Major
>         Attachments: HADOOP-16083.001.patch
>
>
> {code:java|title=CopyMapper#setup}
> ...
> try {
>   overWrite = overWrite
>       || targetFS.getFileStatus(targetFinalPath).isFile();
> } catch (FileNotFoundException ignored) {
> }
> ...
> {code}
> The above code overrides the config key "overWrite", setting it to "true"
> when the target path is a file. Therefore, an unnecessary transfer happens
> even when the source and target files have the same checksum.
> My suggestion is: remove the code above. If the user insists on overwriting,
> they can just add -overwrite to the options:
> {code:bash|title=DistCp command with -overwrite option}
> hadoop distcp -overwrite hdfs://localhost:64464/source/5/6.txt \
>   hdfs://localhost:64464/target/5/6.txt
> {code}

--
This message was sent by Atlassian Jira
(v8.3.4#803005)

To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
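To make the effect described in the issue concrete, here is a minimal, self-contained sketch. It is not the actual CopyMapper code: the class name OverwriteDecisionSketch, its methods, and the boolean inputs are all hypothetical, and the copy/skip rule in mustCopy is a simplifying assumption drawn from the issue text rather than DistCp's real decision logic.

```java
/**
 * Hypothetical model of the behavior discussed in HADOOP-16083.
 * None of these names exist in Hadoop; they only illustrate the reasoning.
 */
public class OverwriteDecisionSketch {

    /** Behavior described in the issue: the flag is forced to true when the target is a file. */
    public static boolean effectiveOverwriteCurrent(boolean overwriteOption, boolean targetIsFile) {
        return overwriteOption || targetIsFile;
    }

    /** Proposed behavior: the flag is exactly what the user passed (-overwrite or not). */
    public static boolean effectiveOverwriteProposed(boolean overwriteOption, boolean targetIsFile) {
        return overwriteOption;
    }

    /**
     * Assumed, simplified rule: copy when overwrite is in effect, when the
     * target does not exist yet, or when the checksums differ.
     */
    public static boolean mustCopy(boolean overwrite, boolean targetExists, boolean checksumsMatch) {
        return overwrite || !targetExists || !checksumsMatch;
    }

    public static void main(String[] args) {
        boolean overwriteOption = false; // user did not pass -overwrite
        boolean targetIsFile = true;     // target path is an existing file
        boolean checksumsMatch = true;   // source and target are identical

        boolean current = mustCopy(
            effectiveOverwriteCurrent(overwriteOption, targetIsFile), targetIsFile, checksumsMatch);
        boolean proposed = mustCopy(
            effectiveOverwriteProposed(overwriteOption, targetIsFile), targetIsFile, checksumsMatch);

        System.out.println("current behavior copies:  " + current);
        System.out.println("proposed behavior copies: " + proposed);
    }
}
```

Under these assumptions, the only case where the two variants differ is an existing target file with matching checksums and no -overwrite option, which is the unnecessary transfer the issue wants to avoid.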
[jira] [Updated] (HADOOP-16083) DistCp shouldn't always overwrite the target file when checksums match
[ https://issues.apache.org/jira/browse/HADOOP-16083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wei-Chiu Chuang updated HADOOP-16083:

===Bulk update===
I am planning to cut the branch for the Hadoop 3.3.1 release, and this Jira currently targets 3.3.1. Please take the time to review the patch, or push it out of 3.3.1 if you think it can't be finished in the next few weeks.
[jira] [Updated] (HADOOP-16083) DistCp shouldn't always overwrite the target file when checksums match
[ https://issues.apache.org/jira/browse/HADOOP-16083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Steve Loughran updated HADOOP-16083:
    Target Version/s: 3.3.1  (was: 3.2.3)
[jira] [Updated] (HADOOP-16083) DistCp shouldn't always overwrite the target file when checksums match
[ https://issues.apache.org/jira/browse/HADOOP-16083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Xiaoqiao He updated HADOOP-16083:
    Target Version/s: 3.2.3  (was: 3.2.2)

Updated the target version to 3.2.3 in preparation for the 3.2.2 release. Please let me know if this is a blocker for you. Thanks.
[jira] [Updated] (HADOOP-16083) DistCp shouldn't always overwrite the target file when checksums match
[ https://issues.apache.org/jira/browse/HADOOP-16083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhankun Tang updated HADOOP-16083:
    Target Version/s: 3.3.0, 3.2.1, 3.1.4  (was: 3.3.0, 3.2.1, 3.1.3)
[jira] [Updated] (HADOOP-16083) DistCp shouldn't always overwrite the target file when checksums match
[ https://issues.apache.org/jira/browse/HADOOP-16083?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Siyao Meng updated HADOOP-16083:
    Attachment: HADOOP-16083.001.patch
        Status: Patch Available  (was: Open)