[ 
https://issues.apache.org/jira/browse/HADOOP-5929?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12713850#action_12713850
 ] 

Devaraj Das commented on HADOOP-5929:
-------------------------------------

Also, we should remove the <date> field from the job history filename format.
The date is already present in the jobID, which is itself part of the
filename.
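
To illustrate why a separate <date> field would be redundant, here is a minimal
sketch that reads the timestamp back out of a job ID. It assumes the layout
job_<yyyyMMddHHmm>_<sequence>, where the middle part is the JobTracker start
time; the class name JobIdDate, the method startTimeOf, and the sample ID are
hypothetical and not taken from the Hadoop code base.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

public class JobIdDate {
    // Assumption: the middle component of a job ID is a yyyyMMddHHmm timestamp.
    private static final DateTimeFormatter STAMP =
            DateTimeFormatter.ofPattern("yyyyMMddHHmm");

    // Extracts the timestamp embedded in an ID shaped like job_<stamp>_<sequence>.
    static LocalDateTime startTimeOf(String jobId) {
        String[] parts = jobId.split("_");
        if (parts.length != 3 || !parts[0].equals("job")) {
            throw new IllegalArgumentException("unexpected job ID: " + jobId);
        }
        return LocalDateTime.parse(parts[1], STAMP);
    }

    public static void main(String[] args) {
        // Hypothetical job ID used only for illustration.
        System.out.println(startTimeOf("job_200905271030_0001"));
    }
}
```

If the ID really follows that layout, any filename that embeds the jobID
already carries this timestamp, so dropping <date> would lose no information.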

> Cleanup JobHistory file naming to do with job recovery
> ------------------------------------------------------
>
>                 Key: HADOOP-5929
>                 URL: https://issues.apache.org/jira/browse/HADOOP-5929
>             Project: Hadoop Core
>          Issue Type: Bug
>          Components: mapred
>    Affects Versions: 0.19.0
>            Reporter: Devaraj Das
>             Fix For: 0.21.0
>
>
> The JobTracker uses the job history files to recover jobs upon startup. To
> handle cases where the JobTracker goes down again while a recovered job is
> running, there is some file-juggling logic that leaves two history files
> for some window of time during the life of the job: the actual history
> file and a .recover file. The idea is that upon the next restart we should
> be able to recover the maximal number of events for the job. This logic led to
> performance problems in job submission / recovery (part of which got
> addressed in HADOOP-4372). It also looks pretty unlikely that a running job
> will span multiple JT restarts. Even if it did, without the
> .recover file, it'd only mean that we lose some tasks that got completed in a
> subsequent restart. I propose that we remove the .recover file logic and base
> the recovery on only the original job history file.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
