If I execute 'hadoop distcp hdfs:///tmp/test1.txt
ftp://ftpuser:ftpuser@hostname/tmp/', the exception is:
attempt_201304222240_0006_m_000000_1: log4j:ERROR Could not connect to
remote log4j server at [localhost]. We will try again later.
13/04/23 19:31:33 INFO mapred.JobClient: Task Id :
attempt_201304222240_0006_m_000000_2, Status : FAILED
java.io.IOException: Cannot rename parent(source):
ftp://ftpuser:ftpuser@hostname/tmp/_distcp_logs_o6gzfy/_temporary/_attempt_201304222240_0006_m_000000_2,
parent(destination):
ftp://ftpuser:ftpu...@bdvm104.svl.ibm.com/tmp/_distcp_logs_o6gzfy
        at
org.apache.hadoop.fs.ftp.FTPFileSystem.rename(FTPFileSystem.java:547)
        at
org.apache.hadoop.fs.ftp.FTPFileSystem.rename(FTPFileSystem.java:512)
        at
org.apache.hadoop.mapred.FileOutputCommitter.moveTaskOutputs(FileOutputCommitter.java:154)
        at
org.apache.hadoop.mapred.FileOutputCommitter.moveTaskOutputs(FileOutputCommitter.java:172)
        at
org.apache.hadoop.mapred.FileOutputCommitter.commitTask(FileOutputCommitter.java:132)
        at
org.apache.hadoop.mapred.OutputCommitter.commitTask(OutputCommitter.java:221)
        at org.apache.hadoop.mapred.Task.commit(Task.java:1019)
        at org.apache.hadoop.mapred.Task.done(Task.java:889)
        at org.apache.hadoop.mapred.MapTask.run(MapTask.java:373)
        at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
        at
java.security.AccessController.doPrivileged(AccessController.java:310)
        at javax.security.auth.Subject.doAs(Subject.java:573)
        at
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
        at org.apache.hadoop.mapred.Child.main(Child.java:249)
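A note on the rename failure above: the message suggests the FTP filesystem only permits renames within a single directory, while the task commit moves output from the _temporary subtree up into the job directory, so the two parents differ. A minimal sketch of such a same-parent check (the helper name and paths are illustrative, not the actual FTPFileSystem code):

```python
import posixpath

def rename_allowed(src, dst):
    # A same-parent rename check: allow the rename only when the source
    # and destination live in the same directory.
    return posixpath.dirname(src) == posixpath.dirname(dst)

# Task commit moves output from _temporary up into the job directory,
# so the two parents differ and the rename is rejected.
src = "/tmp/_distcp_logs_o6gzfy/_temporary/_attempt_201304222240_0006_m_000000_2/part-0"
dst = "/tmp/_distcp_logs_o6gzfy/part-0"
print(rename_allowed(src, dst))  # prints False
```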



2013/4/24 sam liu <samliuhad...@gmail.com>

> Now, I can successfully run "hadoop distcp
> ftp://ftpuser:ftpuser@hostname/tmp/test1.txt hdfs:///tmp/test1.txt".
>
> But it fails on "hadoop distcp hdfs:///tmp/test1.txt
> ftp://ftpuser:ftpuser@hostname/tmp/test1.txt.v1", returning:
> attempt_201304222240_0005_m_000000_1: log4j:ERROR Could not connect to
> remote log4j server at [localhost]. We will try again later.
> 13/04/23 18:59:05 INFO mapred.JobClient: Task Id :
> attempt_201304222240_0005_m_000000_2, Status : FAILED
> java.io.IOException: Copied: 0 Skipped: 0 Failed: 1
>         at
> org.apache.hadoop.tools.DistCp$CopyFilesMapper.close(DistCp.java:582)
>         at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:57)
>         at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:435)
>         at org.apache.hadoop.mapred.MapTask.run(MapTask.java:371)
>
>         at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>         at
> java.security.AccessController.doPrivileged(AccessController.java:310)
>         at javax.security.auth.Subject.doAs(Subject.java:573)
>         at
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1149)
>         at org.apache.hadoop.mapred.Child.main(Child.java:249)
>
>
> 2013/4/24 sam liu <samliuhad...@gmail.com>
>
>> I can successfully execute "hadoop fs -ls
>> ftp://hadoopadm:xxxxxxxx@ftphostname", which returns the root path of the
>> Linux system.
>>
>> But "hadoop fs -rm
>> ftp://hadoopadm:xxxxxxxx@ftphostname/some/path/here" fails, returning:
>> rm: Delete failed ftp://hadoopadm:xxxxxxxx@ftphostname/some/path/here
>>
>>
>> 2013/4/24 Daryn Sharp <da...@yahoo-inc.com>
>>
>>>  The ftp fs is listing the contents of the given path's parent
>>> directory, and then trying to match the basename of each child path
>>> returned against the basename of the given path – quite inefficient…  The
>>> FNF means it didn't find a match for the basename.  It may be that the ftp
>>> server isn't returning a listing in exactly the expected format, so it's
>>> being parsed incorrectly.
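The lookup Daryn describes above can be sketched as follows (a hedged illustration, not the actual FTPFileSystem code; `list_dir` stands in for an FTP LIST of the parent directory):

```python
import posixpath

def get_file_status(path, list_dir):
    # List the parent directory, then match each child's basename
    # against the basename of the requested path.
    parent = posixpath.dirname(path)
    wanted = posixpath.basename(path)
    for child in list_dir(parent):
        if posixpath.basename(child) == wanted:
            return child  # a stand-in for the file's status
    # No basename matched: the FileNotFoundException case. A listing
    # the parser mangles (unexpected server format) ends up here too.
    raise FileNotFoundError(path)

listing = lambda parent: [parent + "/here", parent + "/other"]
print(get_file_status("/some/path/here", listing))  # prints /some/path/here
```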
>>>
>>>  Does "hadoop fs -ls ftp://hadoopadm:xxxxxxxx@ftphostname/some/path/here"
>>> work?  Or "hadoop fs -rm
>>> ftp://hadoopadm:xxxxxxxx@ftphostname/some/path/here"?  Those cmds
>>> should exercise the same code paths where you are experiencing errors.
>>>
>>>  Daryn
>>>
>>>  On Apr 22, 2013, at 9:06 PM, sam liu wrote:
>>>
>>>  I encountered IOException and FileNotFoundException:
>>>
>>> 13/04/17 17:11:10 INFO mapred.JobClient: Task Id :
>>> attempt_201304160910_2135_m_000000_0, Status : FAILED
>>> java.io.IOException: The temporary job-output directory
>>> ftp://hadoopadm:xxxxxxxx@ftphostname/tmp/_distcp_logs_i74spu/_temporary
>>> doesn't exist!
>>>     at
>>> org.apache.hadoop.mapred.FileOutputCommitter.getWorkPath(FileOutputCommitter.java:250)
>>>     at
>>> org.apache.hadoop.mapred.FileOutputFormat.getTaskOutputPath(FileOutputFormat.java:244)
>>>     at
>>> org.apache.hadoop.mapred.TextOutputFormat.getRecordWriter(TextOutputFormat.java:116)
>>>     at
>>> org.apache.hadoop.mapred.MapTask$DirectMapOutputCollector.<init>(MapTask.java:820)
>>>     at org.apache.hadoop.mapred.MapTask.runOldMapper(MapTask.java:430)
>>>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:372)
>>>     at org.apache.hadoop.mapred.Child$4.run(Child.java:255)
>>>     at
>>> java.security.AccessController.doPrivileged(AccessController.java:310)
>>>     at javax.security.auth.Subject.doAs(Subject.java:573)
>>>     at
>>> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1144)
>>>     at org.apache.hadoop.mapred.Child.main(Child.java:249)
>>>
>>>
>>> ... ...
>>>
>>> 13/04/17 17:11:42 INFO mapred.JobClient: Job complete:
>>> job_201304160910_2135
>>> 13/04/17 17:11:42 INFO mapred.JobClient: Counters: 6
>>> 13/04/17 17:11:42 INFO mapred.JobClient:   Job Counters
>>> 13/04/17 17:11:42 INFO mapred.JobClient:     Failed map tasks=1
>>> 13/04/17 17:11:42 INFO mapred.JobClient:     SLOTS_MILLIS_MAPS=33785
>>> 13/04/17 17:11:42 INFO mapred.JobClient:     Launched map tasks=4
>>> 13/04/17 17:11:42 INFO mapred.JobClient:     Total time spent by all
>>> reduces waiting after reserving slots (ms)=0
>>> 13/04/17 17:11:42 INFO mapred.JobClient:     Total time spent by all
>>> maps waiting after reserving slots (ms)=0
>>> 13/04/17 17:11:42 INFO mapred.JobClient:     SLOTS_MILLIS_REDUCES=6436
>>> 13/04/17 17:11:42 INFO mapred.JobClient: Job Failed: # of failed Map
>>> Tasks exceeded allowed limit. FailedCount: 1. LastFailedTask:
>>> task_201304160910_2135_m_000000
>>> With failures, global counters are inaccurate; consider running with -i
>>> Copy failed: java.io.FileNotFoundException: File
>>> ftp://hadoopadm:xxxxxxxx@ftphostname/tmp/_distcp_tmp_i74spu does not
>>> exist.
>>>     at
>>> org.apache.hadoop.fs.ftp.FTPFileSystem.getFileStatus(FTPFileSystem.java:419)
>>>     at
>>> org.apache.hadoop.fs.ftp.FTPFileSystem.delete(FTPFileSystem.java:302)
>>>     at
>>> org.apache.hadoop.fs.ftp.FTPFileSystem.delete(FTPFileSystem.java:279)
>>>     at org.apache.hadoop.tools.DistCp.fullyDelete(DistCp.java:963)
>>>     at org.apache.hadoop.tools.DistCp.copy(DistCp.java:672)
>>>     at org.apache.hadoop.tools.DistCp.run(DistCp.java:881)
>>>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:65)
>>>     at org.apache.hadoop.util.ToolRunner.run(ToolRunner.java:79)
>>>     at org.apache.hadoop.tools.DistCp.main(DistCp.java:908)
>>>
>>>
>>>>
>>>> 2013/4/23 Daryn Sharp <da...@yahoo-inc.com>
>>>>
>>>>> I believe it should work…  What error message did you receive?
>>>>>
>>>>> Daryn
>>>>>
>>>>> On Apr 22, 2013, at 3:45 AM, sam liu wrote:
>>>>>
>>>>> > Hi Experts,
>>>>> >
>>>>> > I failed to execute the following command. Does DistCp not support
>>>>> > the FTP protocol?
>>>>> >
>>>>> > hadoop distcp ftp://hadoopadm:xxxxxxxx@ftphostname/tmp/file1.txt
>>>>> > hdfs:///tmp/file1.txt
>>>>> >
>>>>> > Thanks!
>>>>>
>>>>>
>>>>
>>>
>>>
>>
>
