[ https://issues.apache.org/jira/browse/HADOOP-2050?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#action_12534716 ]
Runping Qi commented on HADOOP-2050:
------------------------------------
The problem turned out to be in the CopyFiles class.
After a mapper is killed for failing to report progress,
a new attempt may be scheduled before the DFS lease that the failed
mapper holds on the destination file has expired.
When the new attempt tries to create the destination file,
an AlreadyBeingCreatedException is thrown.
CopyFiles should handle that exception and retry after sleeping for a short
while.
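
The retry could look roughly like the sketch below. It is only an illustration, not the actual patch: the class CreateWithRetry, the method createWithRetry, and the parameters maxAttempts and sleepMillis are hypothetical names, and the file creation is passed in as a Callable so the sketch runs without a cluster. It also retries on any IOException for simplicity; a real fix would probably retry only when the failure is the AlreadyBeingCreatedException shown in the trace.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

// Hypothetical helper, not part of CopyFiles: retries a "create" call
// that may fail while an earlier task attempt still holds the lease.
class CreateWithRetry {

    static <T> T createWithRetry(Callable<T> create, int maxAttempts, long sleepMillis)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return create.call();
            } catch (IOException e) {
                // Assumption: any IOException is treated as retryable here.
                last = e;
                if (attempt < maxAttempts) {
                    // Give the old lease time to expire before trying again.
                    Thread.sleep(sleepMillis);
                }
            }
        }
        // Every attempt failed: surface the last failure to the caller.
        throw last;
    }

    public static void main(String[] args) throws Exception {
        final int[] calls = {0};
        // Simulated create: fails twice as if the lease were still held.
        String result = createWithRetry(() -> {
            if (++calls[0] < 3) {
                throw new IOException("simulated: lease still held");
            }
            return "created";
        }, 5, 10L);
        System.out.println(result + " after " + calls[0] + " calls");
    }
}
```

In the mapper, the Callable would wrap the existing call that creates the destination file. The sleep interval and attempt count would need to be chosen with the lease expiry time and the task progress timeout in mind, since a mapper that sleeps too long without reporting progress would itself be killed.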
> distcp failed due to problem in creating files
> ----------------------------------------------
>
> Key: HADOOP-2050
> URL: https://issues.apache.org/jira/browse/HADOOP-2050
> Project: Hadoop
> Issue Type: Bug
> Components: mapred
> Affects Versions: 0.15.0
> Reporter: Runping Qi
>
> When I ran distcp to copy files from one DFS to another, the job failed with
> the mappers throwing the following exception:
> org.apache.hadoop.ipc.RemoteException: org.apache.hadoop.dfs.AlreadyBeingCreatedException: failed to create file /xxxxx/part-00007 for DFSClient_task_200710122302_0002_m_000456_2 on client 72.30.43.23 because current leaseholder is trying to recreate file.
> at org.apache.hadoop.dfs.FSNamesystem.startFileInternal(FSNamesystem.java:850)
> at org.apache.hadoop.dfs.FSNamesystem.startFile(FSNamesystem.java:806)
> at org.apache.hadoop.dfs.NameNode.create(NameNode.java:333)
> at sun.reflect.GeneratedMethodAccessor13.invoke(Unknown Source)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:379)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:596)
> at org.apache.hadoop.ipc.Client.call(Client.java:482)
> at org.apache.hadoop.ipc.RPC$Invoker.invoke(RPC.java:184)
> at org.apache.hadoop.dfs.$Proxy1.create(Unknown Source)
> at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
> at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
> at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
> at java.lang.reflect.Method.invoke(Method.java:597)
> at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:82)
> at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:59)
> at org.apache.hadoop.dfs.$Proxy1.create(Unknown Source)
> at org.apache.hadoop.dfs.DFSClient$DFSOutputStream.<init>(DFSClient.java:1432)
> at org.apache.hadoop.dfs.DFSClient.create(DFSClient.java:376)
> at org.apache.hadoop.dfs.DistributedFileSystem.create(DistributedFileSystem.java:121)
> at org.apache.hadoop.util.CopyFiles$FSCopyFilesMapper.copy(CopyFiles.java:284)
> at org.apache.hadoop.util.CopyFiles$FSCopyFilesMapper.map(CopyFiles.java:352)
> at org.apache.hadoop.util.CopyFiles$FSCopyFilesMapper.map(CopyFiles.java:217)
> at org.apache.hadoop.mapred.MapRunner.run(MapRunner.java:50)
> at org.apache.hadoop.mapred.MapTask.run(MapTask.java:195)
> at org.apache.hadoop.mapred.TaskTracker$Child.main(TaskTracker.java:1750)
> This problem seems to occur in the 2nd, 3rd, and 4th attempts,
> after the first attempt has failed.