[ 
https://issues.apache.org/jira/browse/HADOOP-14631?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17413396#comment-17413396
 ] 

Ayush Saxena commented on HADOOP-14631:
---------------------------------------

{quote}When atomic is set and {{AtomicWorkPath}} == null, distcp will get the 
parent of current {{WorkDir}}. In this case, if {{workdir}} is {{"/"}}, the 
parent will be {{null}}, which means
 {{workDir = new Path(workDir, WIP_PREFIX + targetPath.getName() + 
rand.nextInt());}} will throw a NullPointerException.
{quote}
The NPE discussed here and in HADOOP-14567 is triggered only when the 
{{targetPath}} is the root. I wrote a test in 
{{AbstractContractDistCpTest}} that reproduces it:
{code:java}
  @Test
  public void testDistCpAtomic() throws Exception {
    Path source = new Path(remoteDir, "src");
    remoteFS.mkdirs(source);
    Path dest = new Path(localFS.getUri().toString()+"/");
    String options = "-atomic" + getDefaultCLIOptions();
    DistCpTestUtils.assertRunDistCp(DistCpConstants.SUCCESS, source.toString(),
        dest.toString(), options, conf);
  }
{code}
The catch is that even if we fix the NPE, atomic still doesn't work with {{/}} 
as the target. It ends up throwing an exception:
{code:java}
org.apache.hadoop.tools.CopyListing$InvalidInputException: Target path for atomic-commit already exists: file:/. Cannot atomic-commit to pre-existing target-path.
        at org.apache.hadoop.tools.SimpleCopyListing.validatePaths(SimpleCopyListing.java:163)
        at org.apache.hadoop.tools.CopyListing.buildListing(CopyListing.java:85)
{code}
The reason is that for atomic to work the {{targetPath}} must not exist, but I 
expect {{/}} will always exist. At most we can add a validation that prevents 
the NPE and makes it fail with {{Target path for atomic-commit already exists:}}
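
A minimal sketch of what such a validation could look like. It uses {{java.nio.file.Path}} as a stand-in for Hadoop's {{Path}} so that it runs without Hadoop on the classpath. The class name, the {{resolveWorkDir}} helper, the {{WIP_PREFIX}} value and the use of {{IllegalArgumentException}} in place of {{InvalidInputException}} are assumptions for illustration, not the actual DistCp code:

```java
import java.nio.file.Path;
import java.nio.file.Paths;

// Illustrative only: models the work-dir computation from
// DistCp#configureOutputFormat with java.nio paths instead of Hadoop paths.
class AtomicWorkDirSketch {
    // Assumed prefix value; the real constant lives in DistCp.
    static final String WIP_PREFIX = "._WIP_";

    // Hypothetical helper: returns the temporary work dir for an atomic copy.
    static Path resolveWorkDir(Path atomicWorkPath, Path targetPath, int suffix) {
        // A root target has no parent. It also always exists, so an atomic
        // commit cannot succeed; fail with the familiar message, not an NPE.
        if (targetPath.getParent() == null) {
            throw new IllegalArgumentException(
                "Target path for atomic-commit already exists: " + targetPath
                + ". Cannot atomic-commit to pre-existing target-path.");
        }
        // Fall back to the target's parent when no work path was supplied.
        Path base = (atomicWorkPath != null) ? atomicWorkPath : targetPath.getParent();
        return base.resolve(WIP_PREFIX + targetPath.getFileName() + suffix);
    }

    public static void main(String[] args) {
        // Non-root target: the work dir is created next to the target.
        System.out.println(resolveWorkDir(null, Paths.get("/data/out"), 7));
        try {
            // Root target: rejected up front.
            resolveWorkDir(null, Paths.get("/"), 7);
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The check is placed before the fallback to the parent so that a root target is rejected whether or not an atomic work path was supplied, which matches the failure the existing path validation already produces.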

> Distcp should add a default  atomicWorkPath properties when using atomic
> ------------------------------------------------------------------------
>
>                 Key: HADOOP-14631
>                 URL: https://issues.apache.org/jira/browse/HADOOP-14631
>             Project: Hadoop Common
>          Issue Type: Bug
>    Affects Versions: 2.7.3, 3.0.0-alpha3
>            Reporter: Hongyuan Li
>            Assignee: Hongyuan Li
>            Priority: Major
>             Fix For: 2.7.3
>
>
> Distcp should add a default {{AtomicWorkPath}} property when using atomic.
> {{Distcp}}#{{configureOutputFormat}} uses the code below to generate the 
> atomic work path:
> {code}
>     if (context.shouldAtomicCommit()) {
>       Path workDir = context.getAtomicWorkPath();
>       if (workDir == null) {
>         workDir = targetPath.getParent();
>       }
>       workDir = new Path(workDir, WIP_PREFIX + targetPath.getName()
>                                 + rand.nextInt());
> {code}
> When atomic is set and {{AtomicWorkPath}} == null, distcp will get the parent 
> of current {{WorkDir}}. In this case, if {{workdir}} is {{"/"}}, the parent 
> will be {{null}}, which means 
> {{workDir = new Path(workDir, WIP_PREFIX + targetPath.getName() + 
> rand.nextInt());}} will throw a NullPointerException.


