[ https://issues.apache.org/jira/browse/MAPREDUCE-4770?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jaikannan Ramamoorthy updated MAPREDUCE-4770:
---------------------------------------------

    Affects Version/s: 0.20.203.0
    
> Hadoop jobs failing with FileNotFound Exception while the job is still running
> ------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-4770
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4770
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: tasktracker
>    Affects Versions: 0.20.203.0
>            Reporter: Jaikannan Ramamoorthy
>
> We are seeing a strange issue in our Hadoop cluster: some of our jobs fail 
> with a FileNotFoundException [see below]. The files in the "attempt_*" 
> directory, and the directory itself, are being deleted while the task is 
> still running on the host. From the Hadoop documentation I understand that 
> the TaskTracker wipes the job directory when it receives a KillJobAction, 
> but I do not see why it would be wiped while the job is still running.
> My question is: what could be deleting the directory while the job is 
> running? Any thoughts or pointers on how to debug this would be helpful; 
> one possible starting point is sketched after the quoted report below.
> Thanks!
> java.io.FileNotFoundException: /hadoop/mapred/local_data/taskTracker//jobcache/job_201211030344_15383/attempt_201211030344_15383_m_000169_0/output/spill29.out (Permission denied)
>     at java.io.FileInputStream.open(Native Method)
>     at java.io.FileInputStream.<init>(FileInputStream.java:120)
>     at org.apache.hadoop.fs.RawLocalFileSystem$TrackingFileInputStream.<init>(RawLocalFileSystem.java:71)
>     at org.apache.hadoop.fs.RawLocalFileSystem$LocalFSFileInputStream.<init>(RawLocalFileSystem.java:107)
>     at org.apache.hadoop.fs.RawLocalFileSystem.open(RawLocalFileSystem.java:177)
>     at org.apache.hadoop.fs.FileSystem.open(FileSystem.java:400)
>     at org.apache.hadoop.mapred.Merger$Segment.init(Merger.java:205)
>     at org.apache.hadoop.mapred.Merger$Segment.access$100(Merger.java:165)
>     at org.apache.hadoop.mapred.Merger$MergeQueue.merge(Merger.java:418)
>     at org.apache.hadoop.mapred.Merger$MergeQueue.merge(Merger.java:381)
>     at org.apache.hadoop.mapred.Merger.merge(Merger.java:77)
>     at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.mergeParts(MapTask.java:1692)
>     at org.apache.hadoop.mapred.MapTask$MapOutputBuffer.flush(MapTask.java:1322)
>     at org.apache.hadoop.mapred.MapTask$NewOutputCollector.close(MapTask.java:698)
>     at org.apache.hadoop.mapred.MapTask.runNewMapper(MapTask.java:765)
>     at org.apache.hadoop.mapred.MapTask.run(MapTask.java:369)
>     at org.apache.hadoop.mapred.Child$4.run(Child.java:259)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:396)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1059)
>     at org.apache.hadoop.mapred.Child.main(Child.java:253)
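
One way to narrow down *when* the attempt directory disappears, independent of what
is removing it, is to watch the jobcache tree for deletions while a job runs and
correlate the timestamps with the TaskTracker log entries around KillJobAction.
The following is a minimal, hypothetical watcher sketch using java.nio.file.WatchService
(Java 7+); the default path is only an example taken from the stack trace above, and
WatchService reports events only for immediate children, so point it at the actual
parent of the directory you care about on the affected TaskTracker host.

import java.nio.file.*;

public class AttemptDirWatcher {
    public static void main(String[] args) throws Exception {
        // Directory to watch; the default below is an assumption based on the
        // path in this report -- pass the real parent of the attempt
        // directories on the affected TaskTracker as the first argument.
        Path dir = Paths.get(args.length > 0 ? args[0]
                : "/hadoop/mapred/local_data/taskTracker");
        WatchService watcher = FileSystems.getDefault().newWatchService();
        // Only immediate children of 'dir' are reported.
        dir.register(watcher, StandardWatchEventKinds.ENTRY_DELETE);
        System.out.println("Watching " + dir + " for deletions...");
        while (true) {
            WatchKey key = watcher.take();           // blocks until an event arrives
            for (WatchEvent<?> event : key.pollEvents()) {
                System.out.println(System.currentTimeMillis()
                        + " DELETED: " + event.context());
            }
            if (!key.reset()) {                      // watched directory itself is gone
                System.out.println("Watched directory was removed; exiting.");
                break;
            }
        }
    }
}

Running this on the TaskTracker host while reproducing the failure would show whether
the deletion happens before the task finishes, which could then be matched against the
TaskTracker log (for example, a KillJobAction or directory-cleanup message at the same
timestamp, or an external purge job on mapred.local.dir).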

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
