[ https://issues.apache.org/jira/browse/MAPREDUCE-3837?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Arun C Murthy updated MAPREDUCE-3837:
-------------------------------------

    Attachment: MAPREDUCE-3837_addendum.patch

I see this on a single node cluster.

Without this patch, tasks which are re-run fail with:

{noformat}
2012-07-11 05:43:18,299 INFO org.apache.hadoop.mapred.TaskInProgress: Error from attempt_201207110542_0001_m_000000_0: java.lang.Throwable: Child Error
        at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:271)
Caused by: java.io.IOException: Creation of /tmp/hadoop-acmurthy/mapred/local/userlogs/job_201207110542_0001/attempt_201207110542_0001_m_000000_0 failed.
        at org.apache.hadoop.mapred.TaskLog.createTaskAttemptLogDir(TaskLog.java:104)
        at org.apache.hadoop.mapred.DefaultTaskController.createLogDir(DefaultTaskController.java:71)
        at org.apache.hadoop.mapred.TaskRunner.prepareLogFiles(TaskRunner.java:316)
        at org.apache.hadoop.mapred.TaskRunner.run(TaskRunner.java:228)
{noformat}



The problem is that mkdirs (at least on Mac OS X) returns false if the directory 
already exists and was not created during the call, so the creation is reported 
as failed even though the directory is there.

A straightforward patch that checks for existence fixes it.
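
For illustration only, the existence check could look roughly like the sketch below. This is not the code in the attached addendum patch: the class name LogDirSketch and the helper ensureDirectory are hypothetical, and the only library behaviour relied on is that java.io.File.mkdirs() returns true only when the call itself created the directory.

```java
import java.io.File;
import java.io.IOException;

class LogDirSketch {
    /**
     * Hypothetical helper (not taken from the patch): make sure dir exists,
     * whether it is created now or was left behind by an earlier attempt.
     */
    static void ensureDirectory(File dir) throws IOException {
        // mkdirs() returns false when the directory already exists, so a
        // false result is only an error if the directory is still missing.
        if (!dir.mkdirs() && !dir.isDirectory()) {
            throw new IOException("Creation of " + dir + " failed.");
        }
    }

    public static void main(String[] args) throws IOException {
        File dir = new File(System.getProperty("java.io.tmpdir"),
                            "logdir-sketch/attempt_0");
        ensureDirectory(dir); // first run: the directory may be created here
        ensureDirectory(dir); // re-run: the directory exists, no exception
        System.out.println("exists: " + dir.isDirectory());
    }
}
```

Checking isDirectory() after a failed mkdirs(), rather than calling exists() beforehand, avoids a window in which another process creates the directory between the check and the call.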
                
> Job tracker is not able to recover job in case of crash and after that no 
> user can submit job.
> ----------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-3837
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3837
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>    Affects Versions: 0.22.0, 1.1.1
>            Reporter: Mayank Bansal
>            Assignee: Mayank Bansal
>             Fix For: 1.2.0, 0.22.1
>
>         Attachments: MAPREDUCE-3837_addendum.patch, 
> PATCH-HADOOP-1-MAPREDUCE-3837-1.patch, PATCH-HADOOP-1-MAPREDUCE-3837-2.patch, 
> PATCH-HADOOP-1-MAPREDUCE-3837-3.patch, PATCH-HADOOP-1-MAPREDUCE-3837-4.patch, 
> PATCH-HADOOP-1-MAPREDUCE-3837.patch, PATCH-MAPREDUCE-3837.patch, 
> PATCH-TRUNK-MAPREDUCE-3837.patch
>
>
> If the job tracker crashes while jobs are running and the job tracker's 
> property mapreduce.jobtracker.restart.recover is true, it should recover 
> those jobs.
> However, the current behavior is as follows: the job tracker tries to 
> restore the jobs but cannot. After that, the job tracker closes its handle 
> to HDFS and nobody else can submit a job.
> Thanks,
> Mayank
