[jira] [Commented] (MAPREDUCE-3837) Hadoop 22 Job tracker is not able to recover job in case of crash and after that no user can submit job.

Tom White (JIRA) Wed, 27 Jun 2012 12:27:46 -0700

    [ https://issues.apache.org/jira/browse/MAPREDUCE-3837?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13402503#comment-13402503 ]


Tom White commented on MAPREDUCE-3837:
--------------------------------------

Mayank - thanks for the changes. Here's my feedback:

* If there is no need for restart count anymore - since jobs are re-run from 
the beginning each time - then would it be cleaner to remove it entirely?
* In JobTracker you changed "shouldRecover = false;" to "shouldRecover = true;" 
without updating the comment on the line before. (This might be related to the 
previous point about not having restart counts.)
* Remove the @Ignore annotation from TestRecoveryManager and the comment about 
MAPREDUCE-873.
* The new test testJobresubmission (should be testJobResubmission) should test 
that the job succeeded after the restart. Also, there's no reason to run it as 
a high-priority job.
* There's a comment saying it is a "faulty job" - which it isn't.
* Have setUp and tearDown methods to start and stop the cluster. At the moment 
there is code duplication, and clusters won't be shut down cleanly on failure.
* testJobTracker would be better named testJobTrackerRestartsWithMissingJobFile.
* testRecoveryManager would be better named testJobTrackerRestartWithBadJobs.
* There are multiple typos and formatting errors (including indentation, which 
should be 2 spaces) in the new code. See Konstantin's comment above.
* TestJobTrackerRestartWithLostTracker still fails, as does 
TestJobTrackerSafeMode. These should be fixed as a part of this work.
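
To illustrate the setUp/tearDown point above, here is a minimal sketch of the 
pattern. FakeCluster, TestBody and runTest are hypothetical stand-ins, not 
Hadoop or JUnit classes; in the real test the cluster would be whatever mini 
cluster TestRecoveryManager already starts, and the test framework would call 
setUp and tearDown itself.

```java
public class ClusterLifecycleSketch {

  /** Hypothetical stand-in for a mini cluster; not a Hadoop class. */
  static class FakeCluster {
    private boolean running;
    void start() { running = true; }
    void shutdown() { running = false; }
    boolean isRunning() { return running; }
  }

  /** Hypothetical test body that receives the running cluster. */
  interface TestBody {
    void run(FakeCluster cluster) throws Exception;
  }

  private FakeCluster cluster;

  // Starts a fresh cluster before each test, so no test repeats this code.
  void setUp() {
    cluster = new FakeCluster();
    cluster.start();
  }

  // Stops the cluster after each test, whether the test passed or failed.
  void tearDown() {
    if (cluster != null) {
      cluster.shutdown();
      cluster = null;
    }
  }

  /**
   * Mimics what a test runner does: setUp, then the test body, then tearDown
   * in a finally block. Returns the cluster so callers can inspect its state.
   */
  static FakeCluster runTest(TestBody body) {
    ClusterLifecycleSketch test = new ClusterLifecycleSketch();
    test.setUp();
    FakeCluster used = test.cluster;
    try {
      body.run(used);
    } catch (Exception | AssertionError e) {
      System.out.println("test failed: " + e.getMessage());
    } finally {
      test.tearDown();
    }
    return used;
  }

  public static void main(String[] args) {
    // A passing test body and a failing one; both leave the cluster stopped.
    FakeCluster afterPass = runTest(c -> {
      if (!c.isRunning()) {
        throw new IllegalStateException("cluster not started");
      }
    });
    FakeCluster afterFail = runTest(c -> {
      throw new IllegalStateException("simulated failure");
    });
    System.out.println("stopped after pass: " + !afterPass.isRunning());
    System.out.println("stopped after failure: " + !afterFail.isRunning());
  }
}
```

Because shutdown happens in one place, a failing test body cannot leave a 
cluster running for the tests that follow.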

                
> Hadoop 22 Job tracker is not able to recover job in case of crash and after 
> that no user can submit job.
> --------------------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-3837
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-3837
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>    Affects Versions: 0.22.0
>            Reporter: Mayank Bansal
>            Assignee: Mayank Bansal
>             Fix For: 0.24.0, 0.22.1, 0.23.2
>
>         Attachments: PATCH-HADOOP-1-MAPREDUCE-3837-1.patch, 
> PATCH-HADOOP-1-MAPREDUCE-3837-2.patch, PATCH-HADOOP-1-MAPREDUCE-3837-3.patch, 
> PATCH-HADOOP-1-MAPREDUCE-3837.patch, PATCH-MAPREDUCE-3837.patch, 
> PATCH-TRUNK-MAPREDUCE-3837.patch
>
>
> If the job tracker crashes while jobs are running, and the job tracker's 
> property mapreduce.jobtracker.restart.recover is true, then it should 
> recover those jobs.
> However, the current behavior is as follows: the jobtracker tries to restore 
> the jobs but cannot. After that, the jobtracker closes its handle to HDFS and 
> nobody else can submit a job.
> Thanks,
> Mayank

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators: 
https://issues.apache.org/jira/secure/ContactAdministrators!default.jspa
For more information on JIRA, see: http://www.atlassian.com/software/jira
