[
https://issues.apache.org/jira/browse/HADOOP-14631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17413396#comment-17413396
]
Ayush Saxena commented on HADOOP-14631:
---------------------------------------
{quote}When atomic is set and {{AtomicWorkPath}} == null, distcp will get the
parent of current {{WorkDir}}. In this case, if {{workdir}} is {{"/"}}, the
parent will be {{null}}, which means
{{workDir = new Path(workDir, WIP_PREFIX + targetPath.getName() +
rand.nextInt());}} will throw a NullPointerException.
{quote}
The NPE discussed here and in HADOOP-14567 triggers only when the
{{targetPath}} is the root. I wrote a test in
{{AbstractContractDistCpTest}}, and it does reproduce the NPE:
{code:java}
@Test
public void testDistCpAtomic() throws Exception {
  Path source = new Path(remoteDir, "src");
  remoteFS.mkdirs(source);
  Path dest = new Path(localFS.getUri().toString() + "/");
  String options = "-atomic" + getDefaultCLIOptions();
  DistCpTestUtils.assertRunDistCp(DistCpConstants.SUCCESS, source.toString(),
      dest.toString(), options, conf);
}
{code}
The catch is that even if we fix the NPE, atomic still doesn't work with {{/}}
as the target. It ends up throwing an exception:
{code:java}
org.apache.hadoop.tools.CopyListing$InvalidInputException: Target path for atomic-commit already exists: file:/. Cannot atomic-commit to pre-existing target-path.
    at org.apache.hadoop.tools.SimpleCopyListing.validatePaths(SimpleCopyListing.java:163)
    at org.apache.hadoop.tools.CopyListing.buildListing(CopyListing.java:85)
{code}
The reason is that for atomic to work, the {{targetPath}} must not exist, but
I expect {{/}} will always exist. At most, we can add a validation that
prevents the NPE and makes it fail with
{{Target path for atomic-commit already exists:}} instead.
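
A minimal sketch of what such a validation could look like. It uses {{java.nio.file.Path}} as a stand-in for Hadoop's {{Path}} so that it runs without Hadoop on the classpath; both return {{null}} from {{getParent()}} for the root. The class name, the {{resolveWorkDir}} helper, the exception message, and the {{WIP_PREFIX}} value are assumptions for illustration, not the actual DistCp code:

```java
import java.nio.file.Path;
import java.nio.file.Paths;

class AtomicWorkDirSketch {
    // Assumed value; stands in for DistCp's WIP_PREFIX constant.
    static final String WIP_PREFIX = "._WIP_";

    // Hypothetical helper mirroring the work-path logic in configureOutputFormat.
    // 'suffix' replaces rand.nextInt() so the result is deterministic.
    static Path resolveWorkDir(Path atomicWorkPath, Path targetPath, int suffix) {
        // The root has no parent; reject it up front instead of hitting an NPE later.
        if (targetPath.getParent() == null) {
            throw new IllegalArgumentException(
                "Target path for atomic-commit has no parent (is it the root?): "
                    + targetPath);
        }
        Path workDir = atomicWorkPath;
        if (workDir == null) {
            // Default: place the temporary work directory next to the target.
            workDir = targetPath.getParent();
        }
        return workDir.resolve(WIP_PREFIX + targetPath.getFileName() + suffix);
    }

    public static void main(String[] args) {
        // Non-root target: the work dir is a sibling of the target.
        System.out.println(resolveWorkDir(null, Paths.get("/data/out"), 7));
        try {
            // Root target: fails with a descriptive message rather than an NPE.
            resolveWorkDir(null, Paths.get("/"), 7);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

With this shape the root target is rejected before any path arithmetic happens, so the user sees a clear validation error. In DistCp itself the equivalent check would presumably sit alongside the existing atomic-commit validation, but where exactly is a design choice left open here.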
> Distcp should add a default atomicWorkPath properties when using atomic
> ------------------------------------------------------------------------
>
> Key: HADOOP-14631
> URL: https://issues.apache.org/jira/browse/HADOOP-14631
> Project: Hadoop Common
> Issue Type: Bug
> Affects Versions: 2.7.3, 3.0.0-alpha3
> Reporter: Hongyuan Li
> Assignee: Hongyuan Li
> Priority: Major
> Fix For: 2.7.3
>
>
> Distcp should add a default {{AtomicWorkPath}} property when atomic is used.
> {{Distcp}}#{{configureOutputFormat}} uses the code below to generate the atomic
> work path:
> {code}
> if (context.shouldAtomicCommit()) {
>   Path workDir = context.getAtomicWorkPath();
>   if (workDir == null) {
>     workDir = targetPath.getParent();
>   }
>   workDir = new Path(workDir, WIP_PREFIX + targetPath.getName()
>       + rand.nextInt());
> {code}
> When atomic is set and {{AtomicWorkPath}} == null, distcp will get the parent
> of current {{WorkDir}}. In this case, if {{workdir}} is {{"/"}}, the parent
> will be {{null}}, which means
> {{workDir = new Path(workDir, WIP_PREFIX + targetPath.getName() +
> rand.nextInt());}} will throw a NullPointerException.