[jira] [Updated] (HADOOP-16158) DistCp to support checksum validation when copy blocks in parallel
[ https://issues.apache.org/jira/browse/HADOOP-16158?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Wei-Chiu Chuang updated HADOOP-16158:
-------------------------------------
       Resolution: Fixed
    Fix Version/s: 3.1.3
                   3.2.1
                   3.3.0
           Status: Resolved  (was: Patch Available)

Pushed to trunk and branch-3.2 without conflicts.
There is a trivial import conflict in branch-3.1. I resolved it and pushed the commit to branch-3.1. See [^HADOOP-16158.branch-3.1.patch] for reference.

> DistCp to support checksum validation when copy blocks in parallel
> ------------------------------------------------------------------
>
>                 Key: HADOOP-16158
>                 URL: https://issues.apache.org/jira/browse/HADOOP-16158
>             Project: Hadoop Common
>          Issue Type: Improvement
>          Components: tools/distcp
>    Affects Versions: 3.2.0, 2.9.2, 3.0.3, 3.1.2
>            Reporter: Kai Xie
>            Assignee: Kai Xie
>            Priority: Major
>             Fix For: 3.3.0, 3.2.1, 3.1.3
>
>         Attachments: HADOOP-16158.branch-3.1.patch
>
>
> Copying blocks in parallel (enabled when blocks per chunk > 0) is a great
> DistCp improvement that can hugely speed up copying big files.
> But its checksum validation is skipped, e.g. in
> `RetriableFileCopyCommand.java`
>
> {code:java}
> if (!source.isSplit()) {
>   compareCheckSums(sourceFS, source.getPath(), sourceChecksum,
>       targetFS, targetPath);
> }
> {code}
> and this could result in checksum/data mismatch without notifying
> developers/users (e.g. HADOOP-16049).
> I'd like to provide a patch to add the checksum validation.
>

--
This message was sent by Atlassian JIRA
(v7.6.14#76016)

---------------------------------------------------------------------
To unsubscribe, e-mail: common-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: common-issues-h...@hadoop.apache.org
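
To illustrate the gap the issue description points at, here is a minimal, self-contained sketch. It is not DistCp's code and not the attached patch. It assumes plain byte arrays in place of files and CRC32 in place of a file-system checksum, and every name in it (ChunkedCopyChecksumSketch, copyInChunks, checksumsMatch) is hypothetical. The idea shown is only that a file copied chunk by chunk can still be validated once the chunks are reassembled, by comparing a whole-file checksum of the source against one of the target.

```java
import java.io.ByteArrayOutputStream;
import java.util.Arrays;
import java.util.zip.CRC32;

// Hypothetical illustration only; none of these names come from Hadoop or DistCp.
class ChunkedCopyChecksumSketch {

    /** CRC32 over a whole byte array (a stand-in for a file-level checksum). */
    static long checksum(byte[] data) {
        CRC32 crc = new CRC32();
        crc.update(data, 0, data.length);
        return crc.getValue();
    }

    /** Copies the source one chunk at a time and concatenates the chunks in order. */
    static byte[] copyInChunks(byte[] source, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        ByteArrayOutputStream target = new ByteArrayOutputStream(source.length);
        for (int offset = 0; offset < source.length; offset += chunkSize) {
            int end = Math.min(offset + chunkSize, source.length);
            byte[] chunk = Arrays.copyOfRange(source, offset, end);
            target.write(chunk, 0, chunk.length);
        }
        return target.toByteArray();
    }

    /** Validation after reassembly: compare whole-file checksums of source and target. */
    static boolean checksumsMatch(byte[] source, byte[] target) {
        return checksum(source) == checksum(target);
    }

    public static void main(String[] args) {
        byte[] source = new byte[10_000];
        for (int i = 0; i < source.length; i++) {
            source[i] = (byte) (i * 31);
        }

        byte[] target = copyInChunks(source, 4096);
        System.out.println("intact copy matches: " + checksumsMatch(source, target));

        // Flip one bit to simulate a chunk that was damaged in transit.
        target[5000] ^= 0x01;
        System.out.println("corrupted copy matches: " + checksumsMatch(source, target));
    }
}
```

Without the final comparison, the flipped bit in the second case would go unnoticed, which is the kind of silent mismatch the description warns about.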
[jira] [Updated] (HADOOP-16158) DistCp to support checksum validation when copy blocks in parallel
Wei-Chiu Chuang updated HADOOP-16158:
-------------------------------------
    Attachment: HADOOP-16158.branch-3.1.patch
[jira] [Updated] (HADOOP-16158) DistCp to support checksum validation when copy blocks in parallel
Kai Xie updated HADOOP-16158:
-----------------------------
    Attachment:     (was: HADOOP-16158-002.patch)
[jira] [Updated] (HADOOP-16158) DistCp to support checksum validation when copy blocks in parallel
Kai Xie updated HADOOP-16158:
-----------------------------
    Attachment:     (was: HADOOP-16158-001.patch)
[jira] [Updated] (HADOOP-16158) DistCp to support checksum validation when copy blocks in parallel
Kai Xie updated HADOOP-16158:
-----------------------------
    Attachment:     (was: HADOOP-16158-005.patch)
[jira] [Updated] (HADOOP-16158) DistCp to support checksum validation when copy blocks in parallel
Kai Xie updated HADOOP-16158:
-----------------------------
    Attachment:     (was: HADOOP-16158-004.patch)
[jira] [Updated] (HADOOP-16158) DistCp to support checksum validation when copy blocks in parallel
Kai Xie updated HADOOP-16158:
-----------------------------
    Attachment:     (was: HADOOP-16158-003.patch)
[jira] [Updated] (HADOOP-16158) DistCp to support checksum validation when copy blocks in parallel
Kai Xie updated HADOOP-16158:
-----------------------------
    Attachment: HADOOP-16158-005.patch

>         Attachments: HADOOP-16158-001.patch, HADOOP-16158-002.patch,
> HADOOP-16158-003.patch, HADOOP-16158-004.patch, HADOOP-16158-005.patch
[jira] [Updated] (HADOOP-16158) DistCp to support checksum validation when copy blocks in parallel
Kai Xie updated HADOOP-16158:
-----------------------------
    Status: Patch Available  (was: Open)

>         Attachments: HADOOP-16158-001.patch, HADOOP-16158-002.patch,
> HADOOP-16158-003.patch, HADOOP-16158-004.patch, HADOOP-16158-005.patch
[jira] [Updated] (HADOOP-16158) DistCp to support checksum validation when copy blocks in parallel
Kai Xie updated HADOOP-16158:
-----------------------------
    Status: Open  (was: Patch Available)

>         Attachments: HADOOP-16158-001.patch, HADOOP-16158-002.patch,
> HADOOP-16158-003.patch, HADOOP-16158-004.patch, HADOOP-16158-005.patch
[jira] [Updated] (HADOOP-16158) DistCp to support checksum validation when copy blocks in parallel
Kai Xie updated HADOOP-16158:
-----------------------------
    Summary: DistCp to support checksum validation when copy blocks in parallel  (was: DistCp to supports checksum validation when copy blocks in parallel)

>         Attachments: HADOOP-16158-001.patch, HADOOP-16158-002.patch,
> HADOOP-16158-003.patch, HADOOP-16158-004.patch