[ https://issues.apache.org/jira/browse/HADOOP-17763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17364688#comment-17364688 ]
Ayush Saxena commented on HADOOP-17763:
---------------------------------------
{quote}String fileListPathStr = context.getTargetPath() + "/fileList.seq";{quote}
Here the file list is stored in the target path. If the delete-missing option
is specified, couldn't CopyCommitter#deleteMissing delete this file as well?
Moreover, we would have to explicitly manage the cleanup of this file in all cases.
In the case of snapshot-based DistCp, this would also modify the target directory,
which can potentially lead to inconsistency.
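
To illustrate the first concern, here is a minimal sketch using only java.nio, not Hadoop APIs. The class DeleteMissingSketch and its deleteMissing method are hypothetical stand-ins; this is not the actual logic of CopyCommitter#deleteMissing, only a simplified "remove whatever is in the target but not in the source" pass. It assumes flat directories and name-based matching.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

public class DeleteMissingSketch {

    /**
     * Hypothetical, simplified "delete missing" pass: removes every regular
     * file directly under target that has no same-named entry under source.
     * Returns the names of the files it deleted.
     */
    static List<String> deleteMissing(Path source, Path target) throws IOException {
        List<Path> targetEntries;
        try (Stream<Path> entries = Files.list(target)) {
            targetEntries = entries.collect(Collectors.toList());
        }
        List<String> deleted = new ArrayList<>();
        for (Path entry : targetEntries) {
            boolean missingInSource = !Files.exists(source.resolve(entry.getFileName()));
            if (Files.isRegularFile(entry) && missingInSource) {
                Files.delete(entry);
                deleted.add(entry.getFileName().toString());
            }
        }
        return deleted;
    }

    public static void main(String[] args) throws IOException {
        Path source = Files.createTempDirectory("distcp-src");
        Path target = Files.createTempDirectory("distcp-dst");

        // A file that exists on both sides and should survive.
        Files.createFile(source.resolve("data.txt"));
        Files.createFile(target.resolve("data.txt"));

        // The listing file placed in the target path, as in the quoted line.
        Files.createFile(target.resolve("fileList.seq"));

        // The listing has no counterpart in the source, so it is removed.
        System.out.println("deleted: " + deleteMissing(source, target));
    }
}
```

In this sketch the listing file is indistinguishable from any other stale target file, which is why keeping it outside the target path (or explicitly excluding it) would be needed.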
> DistCp job fails when AM is killed
> ----------------------------------
>
> Key: HADOOP-17763
> URL: https://issues.apache.org/jira/browse/HADOOP-17763
> Project: Hadoop Common
> Issue Type: Bug
> Reporter: Bilwa S T
> Assignee: Bilwa S T
> Priority: Major
> Attachments: HADOOP-17763.001.patch
>
>
> The job fails because its tasks fail with the exception below:
> {code:java}
> 2021-06-11 18:48:47,047 | ERROR | IPC Server handler 0 on 27101 | Task: attempt_1623387358383_0006_m_000000_1000 - exited : java.io.FileNotFoundException: File does not exist: hdfs://hacluster/staging-dir/dsperf/.staging/_distcp-646531269/fileList.seq
>     at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1637)
>     at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1630)
>     at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>     at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1645)
>     at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1863)
>     at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1886)
>     at org.apache.hadoop.mapreduce.lib.input.SequenceFileRecordReader.initialize(SequenceFileRecordReader.java:54)
>     at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.initialize(MapTask.java:560)
>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:798)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:347)
>     at org.apache.hadoop.mapred.YarnChild$1.run(YarnChild.java:183)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1761)
>     at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:177)
>  | TaskAttemptListenerImpl.java:304
> {code}