[jira] Commented: (MAPREDUCE-873) Simplify Job Recovery

Hadoop QA (JIRA) Wed, 26 Aug 2009 20:07:24 -0700

    [ https://issues.apache.org/jira/browse/MAPREDUCE-873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12748244#action_12748244 ]


Hadoop QA commented on MAPREDUCE-873:
-------------------------------------

+1 overall.  Here are the results of testing the latest attachment 
  http://issues.apache.org/jira/secure/attachment/12417718/873_v3.patch
  against trunk revision 808082.

    +1 @author.  The patch does not contain any @author tags.

    +1 tests included.  The patch appears to include 21 new or modified tests.

    +1 javadoc.  The javadoc tool did not generate any warning messages.

    +1 javac.  The applied patch does not increase the total number of javac 
compiler warnings.

    +1 findbugs.  The patch does not introduce any new Findbugs warnings.

    +1 release audit.  The applied patch does not increase the total number of 
release audit warnings.

    +1 core tests.  The patch passed core unit tests.

    +1 contrib tests.  The patch passed contrib unit tests.

Test results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/522/testReport/
Findbugs warnings: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/522/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/522/artifact/trunk/build/test/checkstyle-errors.html
Console output: 
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/522/console

This message is automatically generated.

> Simplify Job Recovery
> ---------------------
>
>                 Key: MAPREDUCE-873
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-873
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: jobtracker
>    Affects Versions: 0.20.1
>            Reporter: Devaraj Das
>            Assignee: Sharad Agarwal
>             Fix For: 0.21.0
>
>         Attachments: 873_v1.patch, 873_v2.patch, 873_v3.patch
>
>
> On a couple of occasions we have seen the JobTracker fail to handle job
> recovery well, leading to cluster downtime after a restart. The current
> design for job recovery is complex and prone to corner cases that are not
> handled well enough. In retrospect, the transaction-log-based approach
> proposed on HADOOP-3245 (http://tinyurl.com/luh9hb) would have been a
> better, simpler model. However, that is a big project, and for the medium
> term, just handling job resubmission after a restart seems a good tradeoff.
> That is, after a restart the JobTracker will resubmit all jobs that were
> running in its past life. They will all start from the beginning (the
> downside is that completed tasks will re-execute). In the long term, the
> transaction log model or some variant of it should be pursued.
> Thoughts/comments welcome.
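
The resubmit-on-restart idea described in the issue can be illustrated with a minimal Java sketch. This is not the code in the attached patches and does not use any real Hadoop API: the `JobStore` and `Tracker` classes, their methods, and the job ids are all hypothetical, chosen only to show the tradeoff (unfinished jobs are resubmitted from scratch, so per-task progress is lost).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

public class ResubmitOnRestartSketch {

    /** Hypothetical stand-in for durable storage that survives a tracker restart. */
    static class JobStore {
        private final Map<String, String> unfinished = new LinkedHashMap<>();

        void recordSubmitted(String jobId, String jobConf) { unfinished.put(jobId, jobConf); }

        void recordFinished(String jobId) { unfinished.remove(jobId); }

        /** Returns a copy so callers can iterate while the store is updated. */
        Map<String, String> unfinishedJobs() { return new LinkedHashMap<>(unfinished); }
    }

    /** Hypothetical tracker; its task-progress map is in memory and lost on restart. */
    static class Tracker {
        private final JobStore store;
        private final Map<String, Integer> completedTasks = new LinkedHashMap<>();

        Tracker(JobStore store) { this.store = store; }

        void submit(String jobId, String jobConf) {
            store.recordSubmitted(jobId, jobConf);
            completedTasks.put(jobId, 0); // every submission starts with no completed tasks
        }

        void taskCompleted(String jobId) { completedTasks.merge(jobId, 1, Integer::sum); }

        void jobFinished(String jobId) {
            store.recordFinished(jobId);
            completedTasks.remove(jobId);
        }

        /** Returns -1 for a job this tracker instance does not know about. */
        int completedTaskCount(String jobId) { return completedTasks.getOrDefault(jobId, -1); }

        /** After a restart: resubmit every job that had not finished, from the beginning. */
        List<String> recover() {
            List<String> resubmitted = new ArrayList<>();
            for (Map.Entry<String, String> e : store.unfinishedJobs().entrySet()) {
                submit(e.getKey(), e.getValue());
                resubmitted.add(e.getKey());
            }
            return resubmitted;
        }
    }

    public static void main(String[] args) {
        JobStore store = new JobStore();
        Tracker first = new Tracker(store);
        first.submit("job_1", "conf-1");
        first.submit("job_2", "conf-2");
        first.taskCompleted("job_1"); // progress that will be lost on restart
        first.jobFinished("job_2");   // finished jobs are not resubmitted

        // Simulate a restart: a new tracker instance sharing only the durable store.
        Tracker restarted = new Tracker(store);
        System.out.println("resubmitted: " + restarted.recover());
        System.out.println("job_1 completed tasks: " + restarted.completedTaskCount("job_1"));
    }
}
```

The sketch keeps only "which jobs were unfinished" in durable state, which is what makes this model simpler than a transaction log: there is no per-task history to replay, at the cost of re-executing completed tasks.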

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
