[ https://issues.apache.org/jira/browse/MAPREDUCE-11?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Amar Kamat updated MAPREDUCE-11:
--------------------------------

    Attachment: MAPREDUCE-11-v1.8.patch

Attaching a patch that simplifies the job history filename and recovery. Changes are as follows:
# The job history filename is of the format _hostname_jobid_username_jobname_.
# Conf filenames are of the format _hostname_jobid_conf.xml_.
# Upon every restart, all new updates are directed to _history-file.recover_.
# Once the job finishes, the _history-file.recover_ file is renamed to _history-file_.
# Note that the master file (_hostname_jobid_username_jobname_) exists throughout the lifecycle of the job.
# If the jobtracker restarts again, the new updates will be lost.
# There is no searching involved in any case.
# For now, the old jobhistory files are supported via the web UI.

Tested the patch locally and so far no issues. Result of test-patch:

     [exec] +1 overall.
     [exec]     +1 @author. The patch does not contain any @author tags.
     [exec]     +1 tests included. The patch appears to include 9 new or modified tests.
     [exec]     +1 javadoc. The javadoc tool did not generate any warning messages.
     [exec]     +1 javac. The applied patch does not increase the total number of javac compiler warnings.
     [exec]     +1 findbugs. The patch does not introduce any new Findbugs warnings.
     [exec]     +1 release audit. The applied patch does not increase the total number of release audit warnings.

Running ant tests now; testing is in progress.

Things I tested:
# Submitted a job and allowed it to complete. The new job files move to the done folder.
# Submitted a job and killed the jobtracker while the job file was empty, then restarted the jobtracker. Upon completion the files move to the done folder.
# Submitted a job and killed the jobtracker while the job file was being written, then restarted the jobtracker. Upon completion the files move to the done folder. The job was also recovered.
# Checked the web UI:
## History shows old and new files (there is no difference in the layout).
## History pages for old and new jobs have functional links (checked random links and conf links).
## The search facility in history works across files.

> Cleanup JobHistory file naming to do with job recovery
> ------------------------------------------------------
>
>                 Key: MAPREDUCE-11
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-11
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>            Reporter: Devaraj Das
>         Attachments: MAPREDUCE-11-v1.8.patch
>
>
> The JobTracker uses the job history files for doing job recovery upon
> startup. To handle cases where the JobTracker goes down again while the
> recovered job is running, there is some logic that plays with files, and it
> ends up having two history files for some window of time during the life of
> the job: the actual history file and the .recover file. The idea is that upon
> the next restart we should be able to recover the maximal number of events for
> the job. It led to performance problems in job submission / recovery (part of
> which got addressed in HADOOP-4372). It also looks pretty unlikely that a
> running job will traverse across multiple JT restarts. Even if it did, without
> the .recover file, it'd only mean that we lose some tasks that got completed in
> a subsequent restart. I propose that we remove the .recover file logic and base
> the recovery on only the original job history file.
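
For readers following the comment on the patch, the naming scheme it lists can be sketched roughly as below. This is only an illustration and not code from MAPREDUCE-11-v1.8.patch: the class `HistoryNames`, its method names, and the sample host, job id, user, and job name are all hypothetical, and the sketch assumes the parts are joined with plain underscores without any escaping.

```java
// Hypothetical helper, not taken from the patch: it only mirrors the
// filename formats listed in the comment.
final class HistoryNames {
    // Suffix of the file that receives updates after a jobtracker restart.
    static final String RECOVER_SUFFIX = ".recover";

    // Master history file: hostname_jobid_username_jobname
    static String historyFileName(String hostname, String jobId,
                                  String user, String jobName) {
        return String.join("_", hostname, jobId, user, jobName);
    }

    // Conf file: hostname_jobid_conf.xml
    static String confFileName(String hostname, String jobId) {
        return hostname + "_" + jobId + "_conf.xml";
    }

    // File that new updates are directed to after a restart.
    static String recoverFileName(String historyFileName) {
        return historyFileName + RECOVER_SUFFIX;
    }

    // Assumption: before any restart, events go to the master file;
    // after a restart they go to the .recover file, which is renamed
    // back to the master name once the job finishes.
    static String activeFileName(String historyFileName, boolean restarted) {
        return restarted ? recoverFileName(historyFileName) : historyFileName;
    }
}
```

Because every name is derived directly from the hostname, job id, user, and job name, the file for a job can be computed rather than looked up, which is consistent with the comment's point that no searching is involved.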