[ https://issues.apache.org/jira/browse/HADOOP-2050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12534716 ]

Runping Qi commented on HADOOP-2050:
------------------------------------

It turned out to be a problem in the CopyFiles class.
After a mapper is killed due to failing to report progress,
a new attempt may be scheduled shortly afterwards, before the DFS lease held
on the destination file by the failed mapper has expired.
When the new attempt tries to create
the destination file, an AlreadyBeingCreatedException is thrown.

CopyFiles should handle that exception and retry after sleeping for a short
while.
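
A minimal sketch of the kind of retry this suggests, wrapping the
destination-file create in the distcp mapper. The helper name, retry count,
and sleep interval are illustrative assumptions, not the actual patch; the
namenode-side AlreadyBeingCreatedException typically reaches the client
wrapped in a RemoteException, so the sketch simply inspects the message.

import java.io.IOException;

import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.mapred.Reporter;

public class CreateWithRetry {

  // Hypothetical helper: retry the create while the stale lease from the
  // killed attempt is still held on the destination file.
  static FSDataOutputStream createWithRetry(FileSystem destFileSys,
      Path destFile, Reporter reporter) throws IOException {
    final int maxAttempts = 5;            // illustrative
    final long sleepMillis = 30 * 1000L;  // illustrative; let the lease expire
    for (int attempt = 1; ; attempt++) {
      try {
        return destFileSys.create(destFile, true);
      } catch (IOException e) {
        // AlreadyBeingCreatedException arrives wrapped in a RemoteException;
        // a message check is used here only to keep the sketch simple.
        boolean leaseConflict = e.getMessage() != null
            && e.getMessage().contains("AlreadyBeingCreatedException");
        if (!leaseConflict || attempt >= maxAttempts) {
          throw e;
        }
        reporter.setStatus("Waiting for lease on " + destFile + " to expire");
        try {
          Thread.sleep(sleepMillis);
        } catch (InterruptedException ie) {
          throw new IOException("Interrupted while waiting to retry create");
        }
      }
    }
  }
}

Sleeping before each retry gives the namenode time to declare the previous
writer's lease expired, while the bounded attempt count keeps a genuinely
stuck file from stalling the copy forever.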



> distcp failed due to problem in creating files
> ----------------------------------------------
>
>                 Key: HADOOP-2050
>                 URL: https://issues.apache.org/jira/browse/HADOOP-2050
>             Project: Hadoop
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.15.0
>            Reporter: Runping Qi
>
> When I ran a distcp program to copy files from one DFS to another, my job
> failed with the mappers throwing the following exception:
> org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.dfs.AlreadyBeingCreatedException: failed to create file /xxxxx/part-00007 for DFSClient_task_200710122302_0002_m_000456_2 on client 72.30.43.23 because current leaseholder is trying to recreate file.
>       at org.apache.hadoop.dfs.FSNamesystem.startFileInternal(FSNamesystem.java:850)
>       at org.apache.hadoop.dfs.FSNamesystem.startFile(FSNamesystem.java:806)
>       at org.apache.hadoop.dfs.NameNode.create(NameNode.java:333)
>       at sun.reflect.GeneratedMethodAccessor13.invoke(Unknown Source)
>       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>       at java.lang.reflect.Method.invoke(Method.java:597)
>       at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:379)
>       at org.apache.hadoop.ipc.Server$Handler.run(Server.java:596)
>       at org.apache.hadoop.ipc.Client.call(Client.java:482)
>       at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:184)
>       at org.apache.hadoop.dfs.$Proxy1.create(Unknown Source)
>       at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>       at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
>       at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
>       at java.lang.reflect.Method.invoke(Method.java:597)
>       at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
>       at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
>       at org.apache.hadoop.dfs.$Proxy1.create(Unknown Source)
>       at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.<init>(DFSClient.java:1432)
>       at org.apache.hadoop.dfs.DFSClient.create(DFSClient.java:376)
>       at org.apache.hadoop.dfs.DistributedFileSystem.create(DistributedFileSystem.java:121)
>       at org.apache.hadoop.util.CopyFiles$FSCopyFilesMapper.copy(CopyFiles.java:284)
>       at org.apache.hadoop.util.CopyFiles$FSCopyFilesMapper.map(CopyFiles.java:352)
>       at org.apache.hadoop.util.CopyFiles$FSCopyFilesMapper.map(CopyFiles.java:217)
>       at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
>       at org.apache.hadoop.mapred.MapTask.run(MapTask.java:195)
>       at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1750)
> It seems that this problem happened in the 2nd, 3rd, and 4th attempts,
> after the first attempt failed.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
