Joep Rottinghuis created HADOOP-13002:
-----------------------------------------
Summary: distcp behaves differently through code compared to
toolrunner invocation from command-line
Key: HADOOP-13002
URL: https://issues.apache.org/jira/browse/HADOOP-13002
Project: Hadoop Common
Issue Type: Bug
Components: tools/distcp
Affects Versions: 2.7.0, 2.6.0, 2.5.0, 3.0.0
Reporter: Joep Rottinghuis
In Hadoop 2.5 the behavior of distcp changed when it is invoked through code,
if and only if the target directory does not exist and neither -update nor
-atomic is used.
HADOOP-10459 introduced a change to preserve the root directory attributes. It
added a derived property, held both in the options and in the configuration,
that records whether the target path exists. See
https://github.com/apache/hadoop/commit/c5b59477775c797944db4992e8a70289ba2895ed
However, this property is set only when distcp is run from the command line
through ToolRunner, in DistCp.run(String[] argv).
The result is that when the target directory doesn't exist (and neither the
-update nor the -atomic option is used) SimpleCopyListing incorrectly assumes
that the target directory does exist, because the attribute defaults to true.
Copying directory a/b/c to xyz results in the creation of an xyz/c directory
with the content of c in it, rather than the content of c getting copied into
directory xyz directly.
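
To illustrate the difference, here is a minimal sketch of how the flag changes
the destination of a copied file. It is a simplified model of the behavior
described above for a single source directory without -update or -atomic, not
the actual SimpleCopyListing code. The class TargetPathSketch, the method
resolve and the file name part-0 are hypothetical names used only for this
example.

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Hypothetical, simplified model of the listing behavior described in this
// report. It does not use or reproduce any Hadoop classes.
class TargetPathSketch {

    // Returns the path at which a file inside sourceDir ends up under target.
    // targetPathExists stands in for the derived property from HADOOP-10459.
    static String resolve(String sourceDir, String fileName, String target,
                          boolean targetPathExists) {
        Path source = Paths.get(sourceDir);
        // If the target is believed to exist, the source directory itself is
        // copied into it, so paths are computed relative to the source's
        // parent. Otherwise only the contents of the source are copied.
        Path root = targetPathExists ? source.getParent() : source;
        Path relative = root.relativize(source.resolve(fileName));
        // Normalize separators so the result reads the same on any platform.
        return Paths.get(target).resolve(relative).toString().replace('\\', '/');
    }

    public static void main(String[] args) {
        // Flag left at its default of true (the call through code).
        System.out.println(resolve("a/b/c", "part-0", "xyz", true));  // xyz/c/part-0
        // Flag set from a real existence check (the command-line path).
        System.out.println(resolve("a/b/c", "part-0", "xyz", false)); // xyz/part-0
    }
}
```

In this model the only input that differs between the two invocations is the
flag, which matches the report: the command-line path sets it from an actual
existence check, while the call through code leaves it at its default.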