[ 
https://issues.apache.org/jira/browse/MAPREDUCE-11?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Amar Kamat updated MAPREDUCE-11:
--------------------------------

    Attachment: MAPREDUCE-11-v1.8.patch

Attaching a patch that simplifies job history file naming and recovery. 
The changes are as follows:
# job history filename is of the format _hostname_jobid_username_jobname_
# conf filenames are of the format _hostname_jobid_conf.xml_
# upon every restart, all new updates are directed to _history-file.recover_
# once the job finishes, the _history-file.recover_ file is renamed to _history-file_
# note that the master file (_hostname_jobid_username_jobname_) exists throughout the lifecycle of the job
# if the jobtracker restarts again, new updates will be lost
# there is no searching involved in any case
# for now, the old job history files are still supported via the web UI
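
The naming and rename steps listed above can be sketched roughly as follows. This is an illustrative Python sketch, not code from the patch: the helper names (`history_filename`, `conf_filename`, `active_history_file`, `finalize_history`, `demo`), the sample host, job id, and user values, and the assumption that fields are joined with plain underscores are all hypothetical. What exactly is written into the _.recover_ file is not stated above, so the sketch treats its contents as opaque.

```python
import os
import tempfile


def history_filename(hostname, jobid, username, jobname):
    # Master history file name: hostname_jobid_username_jobname
    return "_".join([hostname, jobid, username, jobname])


def conf_filename(hostname, jobid):
    # Conf file name: hostname_jobid_conf.xml
    return "_".join([hostname, jobid]) + "_conf.xml"


def active_history_file(history_path, restarted):
    # After a restart, updates go to "<history-file>.recover";
    # the master file itself stays in place while the job runs.
    return history_path + ".recover" if restarted else history_path


def finalize_history(history_path):
    # On job completion, rename the .recover file (if any) to the master name.
    recover_path = history_path + ".recover"
    if os.path.exists(recover_path):
        os.replace(recover_path, history_path)
    return history_path


def demo():
    # Walk through: write master file, "restart", write .recover, finish job.
    with tempfile.TemporaryDirectory() as d:
        name = history_filename("jt-host", "job_200901010000_0001",
                                "alice", "wordcount")
        master = os.path.join(d, name)
        with open(master, "w") as f:
            f.write("events logged before the restart\n")
        target = active_history_file(master, restarted=True)
        with open(target, "w") as f:
            f.write("events logged after the restart\n")
        finalize_history(master)
        return name, os.path.exists(master), os.path.exists(target)


if __name__ == "__main__":
    print(demo())
```

Because both file names are derived directly from the host, job id, user, and job name, the file for a given job can be located without scanning the history directory, which matches the "no searching" point in the list.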

Tested the patch locally; no issues so far. Result of test-patch:

     [exec] +1 overall.  
     [exec] 
     [exec]     +1 @author.  The patch does not contain any @author tags.
     [exec] 
     [exec]     +1 tests included.  The patch appears to include 9 new or modified tests.
     [exec] 
     [exec]     +1 javadoc.  The javadoc tool did not generate any warning messages.
     [exec] 
     [exec]     +1 javac.  The applied patch does not increase the total number of javac compiler warnings.
     [exec] 
     [exec]     +1 findbugs.  The patch does not introduce any new Findbugs warnings.
     [exec] 
     [exec]     +1 release audit.  The applied patch does not increase the total number of release audit warnings.


Running the ant tests now; testing is in progress.

Things I tested:
# submitted a job and allowed it to complete. The new job files moved to the done folder.
# submitted a job and killed the jobtracker while the job files were empty; restarted the jobtracker, and upon completion the files moved to the done folder
# submitted a job and killed the jobtracker while the job files were being written; restarted the jobtracker, and upon completion the files moved to the done folder. The job was also recovered.
# checked the web UI
 ## history shows old and new files (there is no difference in the layout)
 ## history pages for old and new jobs have functional links (checked random links and conf links)
 ## the search facility in history works across files 

> Cleanup JobHistory file naming to do with job recovery
> ------------------------------------------------------
>
>                 Key: MAPREDUCE-11
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-11
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>            Reporter: Devaraj Das
>         Attachments: MAPREDUCE-11-v1.8.patch
>
>
> The JobTracker uses the job history files for doing job recovery upon 
> startup. To handle cases where JobTracker goes down again while the recovered 
> job is running, there is some logic that plays with files and it ends up 
> having two history files for some window of time during the life of the job - 
> actual history file, .recover file. The idea being that upon the next restart 
> we should be able to recover the maximal number of events for the job. It led to 
> performance problems in the job submission / recovery (part of which got 
> addressed in HADOOP-4372). It also looks pretty unlikely that a running job 
> will traverse across multiple JT restarts. Even if it did, without the 
> .recover file, it'd only mean that we lose some tasks that got completed in a 
> subsequent restart. I propose that we remove the .recover file logic and base 
> the recovery on only the original job history file. 

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
