[
https://issues.apache.org/jira/browse/MAPREDUCE-11?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Amar Kamat updated MAPREDUCE-11:
--------------------------------
Attachment: MAPREDUCE-11-v1.8.patch
Attaching a patch that simplifies the job history filename and recovery.
Changes are as follows:
# job history filename is of the format _hostname_jobid_username_jobname_
# conf filenames are of the format _hostname_jobid_conf.xml_
# upon every restart all the new updates will be directed to
_history-file.recover_
# once the job finishes the _history-file.recover_ file will be renamed to
_history-file_
# note that the master file (_hostname_jobid_username_jobname_) will exist
throughout the lifecycle of the job
# if the jobtracker restarts again, new updates will be lost
# there is no searching involved in any case
# for now the old jobhistory files are supported via web-ui
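For illustration only (this is not code from the patch), the naming scheme and the _.recover_ lifecycle listed above can be sketched roughly as below. The class and method names (HistoryFileSketch, historyFileName, confFileName, openForUpdates, finishJob) are hypothetical, local java.nio files stand in for the real filesystem calls, and two points are assumptions rather than statements from the patch: that a restart truncates any earlier _.recover_ file, and that the final rename replaces the master file.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.nio.file.StandardOpenOption;

public class HistoryFileSketch {
    // Suffix used for the file that receives updates after a restart.
    static final String RECOVER_SUFFIX = ".recover";

    // History filename: hostname_jobid_username_jobname
    static String historyFileName(String hostname, String jobId,
                                  String user, String jobName) {
        return hostname + "_" + jobId + "_" + user + "_" + jobName;
    }

    // Conf filename: hostname_jobid_conf.xml
    static String confFileName(String hostname, String jobId) {
        return hostname + "_" + jobId + "_conf.xml";
    }

    // Pick the file that receives new updates. After a restart this is
    // history-file.recover; the master file is left in place.
    // Assumption: an older .recover file is truncated, which is one way
    // a second restart could lose the updates written in between.
    static Path openForUpdates(Path dir, String historyName,
                               boolean restarted) throws IOException {
        Path target = restarted
            ? dir.resolve(historyName + RECOVER_SUFFIX)
            : dir.resolve(historyName);
        Files.write(target, new byte[0]); // create or truncate
        return target;
    }

    // On job completion, rename history-file.recover to history-file.
    // Assumption: the rename replaces the existing master file.
    static void finishJob(Path dir, String historyName) throws IOException {
        Path recover = dir.resolve(historyName + RECOVER_SUFFIX);
        if (Files.exists(recover)) {
            Files.move(recover, dir.resolve(historyName),
                       StandardCopyOption.REPLACE_EXISTING);
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("history-sketch");
        String name = historyFileName("jt-host", "job_0001", "alice", "wordcount");

        // First run: updates go to the master file.
        Path master = openForUpdates(dir, name, false);
        Files.write(master, "event-1\n".getBytes(StandardCharsets.UTF_8),
                    StandardOpenOption.APPEND);

        // After a restart: updates go to the .recover file.
        Path recover = openForUpdates(dir, name, true);
        Files.write(recover, "event-2\n".getBytes(StandardCharsets.UTF_8),
                    StandardOpenOption.APPEND);

        // Job finished: only the master file name remains.
        finishJob(dir, name);
        System.out.println("master exists: " + Files.exists(master));
        System.out.println("recover exists: " + Files.exists(recover));
    }
}
```

Since both names are derived directly from the hostname and job id, no directory search is needed to locate them. The sketch does not model which events get re-logged into the _.recover_ file during recovery.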
Tested the patch locally, and so far there are no issues. Result of test-patch:
[exec] +1 overall.
[exec]
[exec] +1 @author. The patch does not contain any @author tags.
[exec]
[exec] +1 tests included. The patch appears to include 9 new or modified tests.
[exec]
[exec] +1 javadoc. The javadoc tool did not generate any warning messages.
[exec]
[exec] +1 javac. The applied patch does not increase the total number of javac compiler warnings.
[exec]
[exec] +1 findbugs. The patch does not introduce any new Findbugs warnings.
[exec]
[exec] +1 release audit. The applied patch does not increase the total number of release audit warnings.
Running ant tests now; testing is in progress.
Things I tested:
# submitted a job and allowed it to complete. New job files move to the done folder.
# submitted a job and killed the jobtracker while the job files were empty,
restarted the jobtracker, and the files moved to the done folder upon completion
# submitted a job and killed the jobtracker while the job files had been written to,
restarted the jobtracker, and the files moved to the done folder upon completion. The job
was also recovered
# checked webui
## history shows old and new files (there is no difference between the layout)
## history pages for old and new jobs have functional links (checked random
links and conf links)
## search facility in history works across files
> Cleanup JobHistory file naming to do with job recovery
> ------------------------------------------------------
>
> Key: MAPREDUCE-11
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-11
> Project: Hadoop Map/Reduce
> Issue Type: Bug
> Reporter: Devaraj Das
> Attachments: MAPREDUCE-11-v1.8.patch
>
>
> The JobTracker uses the job history files for doing job recovery upon
> startup. To handle cases where JobTracker goes down again while the recovered
> job is running, there is some logic that plays with files and it ends up
> having two history files for some window of time during the life of the job -
> actual history file, .recover file. The idea being that upon the next restart
> we should be able to recover the maximal number of events for the job. It led to
> performance problems in the job submission / recovery (part of which got
> addressed in HADOOP-4372). It also looks pretty unlikely that a running job
> will traverse across multiple JT restarts. Even if it did, without the
> .recover file, it'd only mean that we lose some tasks that got completed in a
> subsequent restart. I propose that we remove the .recover file logic and base
> the recovery on only the original job history file.
--
This message is automatically generated by JIRA.