Mavin Martin created HADOOP-13024:
-------------------------------------

             Summary: Distcp with -delete feature on raw data not implemented
                 Key: HADOOP-13024
                 URL: https://issues.apache.org/jira/browse/HADOOP-13024
             Project: Hadoop Common
          Issue Type: Bug
    Affects Versions: 2.6.0
            Reporter: Mavin Martin


When running distcp on raw data with the -delete option, the following error appears.
{code}
[root@xxx bin]# hadoop distcp -delete -update /.reserved/raw/tmp/a /.reserved/raw/tmp/b
16/04/14 02:54:01 ERROR tools.DistCp: Exception encountered
java.io.IOException: DistCp failure: Job job_xxx has failed: Job commit failed: org.apache.hadoop.tools.CopyListing$InvalidInputException: The source path 'hdfs://nn/.reserved/raw/tmp/b' starts with /.reserved/raw but the target path 'hdfs://nn/NONE' does not. Either all or none of the paths must have this prefix.
        at org.apache.hadoop.tools.SimpleCopyListing.validatePaths(SimpleCopyListing.java:141)
        at org.apache.hadoop.tools.CopyListing.buildListing(CopyListing.java:85)
        at org.apache.hadoop.tools.GlobbedCopyListing.doBuildListing(GlobbedCopyListing.java:90)
        at org.apache.hadoop.tools.CopyListing.buildListing(CopyListing.java:86)
        at org.apache.hadoop.tools.mapred.CopyCommitter.deleteMissing(CopyCommitter.java:244)
        at org.apache.hadoop.tools.mapred.CopyCommitter.commitJob(CopyCommitter.java:94)
        at org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler$EventProcessor.handleJobCommit(CommitterEventHandler.java:274)
        at org.apache.hadoop.mapreduce.v2.app.commit.CommitterEventHandler$EventProcessor.run(CommitterEventHandler.java:237)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)


        at org.apache.hadoop.tools.DistCp.execute(DistCp.java:187)
        at org.apache.hadoop.tools.DistCp.run(DistCp.java:122)
        at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:70)
        at org.apache.hadoop.tools.DistCp.main(DistCp.java:429)
{code}

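The problem is not in the distributed copy itself. It occurs at job commit, 
when distcp deletes entries in the target that no longer exist in the source: 
CopyCommitter.deleteMissing builds a second copy listing, and that listing 
revalidates the paths against the placeholder target NONE. Because NONE is not 
under /.reserved/raw while the listed path is, validation fails.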
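
To make the failing rule concrete, here is a minimal sketch of an "all or none" prefix check, written only from the wording of the error message above. The class RawPathCheck and its methods are hypothetical names for illustration; this is not the actual SimpleCopyListing.validatePaths code.

```java
import java.net.URI;

// Hypothetical, simplified stand-in for the check that fails in the trace.
// It only mirrors the rule quoted in the error message.
final class RawPathCheck {
    static final String RAW_PREFIX = "/.reserved/raw";

    // True when the path component of the URI is /.reserved/raw or lies under it.
    static boolean isRaw(String uri) {
        String path = URI.create(uri).getPath();
        return path.equals(RAW_PREFIX) || path.startsWith(RAW_PREFIX + "/");
    }

    // "Either all or none of the paths must have this prefix."
    static boolean consistent(String source, String target) {
        return isRaw(source) == isRaw(target);
    }

    public static void main(String[] args) {
        // The copy the user asked for: both sides are raw, so the rule holds.
        System.out.println(consistent(
                "hdfs://nn/.reserved/raw/tmp/a",
                "hdfs://nn/.reserved/raw/tmp/b")); // prints "true"

        // The pair reported in the error: raw path versus the placeholder NONE.
        System.out.println(consistent(
                "hdfs://nn/.reserved/raw/tmp/b",
                "hdfs://nn/NONE")); // prints "false"
    }
}
```

Under this reading, any listing built against the NONE placeholder fails the rule whenever the listed path is under /.reserved/raw, which matches the exception in the trace.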



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
