[
https://issues.apache.org/jira/browse/MAPREDUCE-873?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12748244#action_12748244
]
Hadoop QA commented on MAPREDUCE-873:
-------------------------------------
+1 overall. Here are the results of testing the latest attachment
http://issues.apache.org/jira/secure/attachment/12417718/873_v3.patch
against trunk revision 808082.
+1 @author. The patch does not contain any @author tags.
+1 tests included. The patch appears to include 21 new or modified tests.
+1 javadoc. The javadoc tool did not generate any warning messages.
+1 javac. The applied patch does not increase the total number of javac
compiler warnings.
+1 findbugs. The patch does not introduce any new Findbugs warnings.
+1 release audit. The applied patch does not increase the total number of
release audit warnings.
+1 core tests. The patch passed core unit tests.
+1 contrib tests. The patch passed contrib unit tests.
Test results:
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/522/testReport/
Findbugs warnings:
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/522/artifact/trunk/build/test/findbugs/newPatchFindbugsWarnings.html
Checkstyle results:
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/522/artifact/trunk/build/test/checkstyle-errors.html
Console output:
http://hudson.zones.apache.org/hudson/job/Mapreduce-Patch-vesta.apache.org/522/console
This message is automatically generated.
> Simplify Job Recovery
> ---------------------
>
> Key: MAPREDUCE-873
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-873
> Project: Hadoop Map/Reduce
> Issue Type: Improvement
> Components: jobtracker
> Affects Versions: 0.20.1
> Reporter: Devaraj Das
> Assignee: Sharad Agarwal
> Fix For: 0.21.0
>
> Attachments: 873_v1.patch, 873_v2.patch, 873_v3.patch
>
>
> On a couple of occasions we have seen the JobTracker fail to handle job
> recovery well, leading to cluster downtime after a restart. The current
> design for job recovery is complex, and its corner cases are prone to being
> handled poorly. In retrospect, the transaction-log-based approach proposed
> in HADOOP-3245 (http://tinyurl.com/luh9hb) would have been a better and
> simpler model. However, that is a big project, and for the medium term,
> simply resubmitting jobs after a restart seems a good tradeoff. That is,
> after a restart the JobTracker will resubmit all jobs that were running in
> its past life. They will all start from the beginning (the downside is that
> completed tasks will re-execute). In the long term, the transaction log
> model or some variant of it should be pursued.
> Thoughts/comments welcome.
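
For illustration only, the resubmit-on-restart idea in the description above could be sketched roughly as follows. This is not code from the patch: the class, enum, and method names (ResubmitOnRestartSketch, JobRecord, jobsToResubmit) are hypothetical, and a plain in-memory list stands in for whatever the JobTracker actually persists about its jobs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical sketch; none of these names come from the Hadoop code base.
public class ResubmitOnRestartSketch {

    enum State { RUNNING, SUCCEEDED, FAILED }

    // Minimal stand-in for the per-job information a tracker might persist.
    static final class JobRecord {
        final String jobId;
        final State state;
        final int completedTasks;

        JobRecord(String jobId, State state, int completedTasks) {
            this.jobId = jobId;
            this.state = state;
            this.completedTasks = completedTasks;
        }
    }

    // After a restart, pick every job that was still running and resubmit it
    // from scratch: progress is deliberately discarded, so tasks that had
    // already completed will run again.
    static List<JobRecord> jobsToResubmit(List<JobRecord> persisted) {
        List<JobRecord> resubmitted = new ArrayList<>();
        for (JobRecord record : persisted) {
            if (record.state == State.RUNNING) {
                resubmitted.add(new JobRecord(record.jobId, State.RUNNING, 0));
            }
        }
        return resubmitted;
    }

    public static void main(String[] args) {
        List<JobRecord> persisted = Arrays.asList(
                new JobRecord("job_A", State.RUNNING, 7),
                new JobRecord("job_B", State.SUCCEEDED, 10),
                new JobRecord("job_C", State.RUNNING, 0));
        for (JobRecord record : jobsToResubmit(persisted)) {
            System.out.println(record.jobId + " restarts with "
                    + record.completedTasks + " completed tasks");
        }
    }
}
```

The point of the sketch is the tradeoff named in the description: nothing about task-level progress has to be recovered, at the cost of redoing completed work.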