[jira] [Commented] (HADOOP-17763) DistCp job fails when AM is killed

Ayush Saxena (Jira) Sat, 26 Jun 2021 00:51:07 -0700


    [ 
https://issues.apache.org/jira/browse/HADOOP-17763?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17369833#comment-17369833
 ]


Ayush Saxena commented on HADOOP-17763:
---------------------------------------

{quote}Tasks fail because we use the staging directory to store split files, and this 
same directory gets deleted whenever the AM relaunches. So we should avoid storing 
split files in the staging directory like other MapReduce applications. 
{quote}
I'm not sure how MapReduce behaves when the AM fails. I thought you said the 
entire staging directory gets deleted when the AM is aborted. In that case, 
how would 'a new folder inside staging directory itself' be saved from 
deletion?

 

But if things work as you said, feel free to go ahead and update the patch. :) 
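
To make the question above concrete, here is a minimal sketch that models it on the local filesystem with plain java.nio. It is not Hadoop code and does not show what the MapReduce AM actually does on relaunch (that is the open point in this thread). The class name StagingDirSketch, the method names, and the folder name _distcp-sketch are hypothetical; only fileList.seq is taken from the stack trace. It shows one thing: if the whole staging directory is deleted recursively, a listing file kept in a new subfolder of it is deleted too.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.stream.Stream;

// Hypothetical local-filesystem model; not the DistCp or MRAppMaster code.
class StagingDirSketch {

    // Recursively delete a directory tree, children before parents.
    static void deleteRecursively(Path root) {
        try (Stream<Path> paths = Files.walk(root)) {
            paths.sorted(Comparator.reverseOrder()).forEach(p -> {
                try {
                    Files.delete(p);
                } catch (IOException e) {
                    throw new UncheckedIOException(e);
                }
            });
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Returns true if a listing file stored in a subfolder of the staging
    // directory still exists after the whole staging directory is removed.
    static boolean listingSurvivesStagingCleanup() {
        try {
            // Assumption: a temp directory stands in for the job staging directory.
            Path staging = Files.createTempDirectory("staging");
            Path sub = Files.createDirectories(staging.resolve("_distcp-sketch"));
            Path listing = Files.createFile(sub.resolve("fileList.seq"));

            // Assumption: cleanup removes the entire staging directory.
            deleteRecursively(staging);
            return Files.exists(listing);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("listing survives: " + listingSurvivesStagingCleanup());
    }
}
```

Under these assumptions the listing does not survive. A subfolder would help only if the cleanup on AM relaunch is narrower than a recursive delete of the whole staging directory.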

> DistCp job fails when AM is killed
> ----------------------------------
>
>                 Key: HADOOP-17763
>                 URL: https://issues.apache.org/jira/browse/HADOOP-17763
>             Project: Hadoop Common
>          Issue Type: Bug
>            Reporter: Bilwa S T
>            Assignee: Bilwa S T
>            Priority: Major
>         Attachments: HADOOP-17763.001.patch
>
>
> The job fails because its tasks fail with the exception below:
> {code:java}
> 2021-06-11 18:48:47,047 | ERROR | IPC Server handler 0 on 27101 | Task: attempt_1623387358383_0006_m_000000_1000 - exited : java.io.FileNotFoundException: File does not exist: hdfs://hacluster/staging-dir/dsperf/.staging/_distcp-646531269/fileList.seq
>  at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1637)
>  at org.apache.hadoop.hdfs.DistributedFileSystem$29.doCall(DistributedFileSystem.java:1630)
>  at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>  at org.apache.hadoop.hdfs.DistributedFileSystem.getFileStatus(DistributedFileSystem.java:1645)
>  at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1863)
>  at org.apache.hadoop.io.SequenceFile$Reader.<init>(SequenceFile.java:1886)
>  at org.apache.hadoop.mapreduce.lib.input.SequenceFileRecordReader.initialize(SequenceFileRecordReader.java:54)
>  at org.apache.hadoop.mapred.MapTask$NewTrackingRecordReader.initialize(MapTask.java:560)
>  at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:798)
>  at org.apache.hadoop.mapred.MapTask.run(MapTask.java:347)
>  at org.apache.hadoop.mapred.YarnChild$1.run(YarnChild.java:183)
>  at java.security.AccessController.doPrivileged(Native Method)
>  at javax.security.auth.Subject.doAs(Subject.java:422)
>  at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1761)
>  at org.apache.hadoop.mapred.YarnChild.main(YarnChild.java:177)
>  | TaskAttemptListenerImpl.java:304{code}



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
