[ https://issues.apache.org/jira/browse/HADOOP-16776?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17039267#comment-17039267 ]
Daryn Sharp commented on HADOOP-16776:
--------------------------------------
HADOOP-16775 (of which this jira is a backport) does not clearly explain the
severity: *distcp copies to s3 will be randomly corrupted*. Basically, every
file other than the first one copied by each task risks being a duplicate of
a file previously copied by that task. It happens surprisingly often and
the job _does not fail_.
[[email protected]], please explain this circular logic:
bq. I don't Think back reporting is this is justified. It's just a safety
measure for people who aren't using -direct
Ok, sounds great, but you blocked adding the -direct flag in HADOOP-15281:
bq. Closing as fixed. I'm not going apply the -direct option to branch-2: if
you want to work with cloud stores, run, don't walk to branch-3
So I can't have the fix and I can't have the -direct workaround...
I'm appalled and dismayed. You're blocking fixes for a critical data
corruption bug due to a personal interest in advancing branch-3? We've been
telling customers for months that it was impossible for distcp to copy the
wrong data and they must be overwriting the s3 destination.
> backport HADOOP-16775 (distcp unique files) to branch-2
> -------------------------------------------------------
>
> Key: HADOOP-16776
> URL: https://issues.apache.org/jira/browse/HADOOP-16776
> Project: Hadoop Common
> Issue Type: Improvement
> Components: tools/distcp
> Affects Versions: 2.8.0, 3.0.0
> Reporter: Amir Shenavandeh
> Priority: Major
> Labels: DistCp
> Attachments: HADOOP-16776-branch-2.8-001.patch,
> HADOOP-16776-branch-2.8-002.patch
>
>
> This is to back port HADOOP-16775 to hadoop 2.8 branch.
>