Cleanup JobHistory file naming to do with job recovery
------------------------------------------------------

                 Key: HADOOP-5929
                 URL: https://issues.apache.org/jira/browse/HADOOP-5929
             Project: Hadoop Core
          Issue Type: Bug
          Components: mapred
    Affects Versions: 0.19.0
            Reporter: Devaraj Das
             Fix For: 0.21.0


The JobTracker uses the job history files to recover jobs upon startup. 
To handle cases where the JobTracker goes down again while a recovered job is 
running, there is some logic that juggles files and ends up keeping two 
history files for some window of time during the life of the job: the actual 
history file and a .recover file. The idea is that upon the next restart we 
should be able to recover the maximal number of events for the job. This led 
to performance problems in job submission / recovery (part of which got 
addressed in HADOOP-4372). It also looks pretty unlikely that a running job 
will survive multiple JT restarts. Even if it did, without the .recover 
file, it'd only mean that we lose some tasks that got completed in a subsequent 
restart. I propose that we remove the .recover file logic and base the recovery 
on only the original job history file. 
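
To make the proposal concrete, here is a minimal sketch of what "base the 
recovery on only the original job history file" could look like. It is not the 
JobTracker's actual code: the class name, the method name, the ".recover" 
suffix constant and the file names are all hypothetical and chosen only for 
illustration. The sketch assumes the .recover file is the history file name 
with ".recover" appended.

```java
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

// Hypothetical illustration only; none of these names come from Hadoop.
class RecoverySketch {
    // Assumed suffix, based on the ".recover file" wording in this issue.
    static final String RECOVER_SUFFIX = ".recover";

    // Picks the file to replay for a job: only an exact match on the
    // original history file name counts. A sibling .recover file is never
    // selected, so recovery has a single source of truth.
    static Optional<Path> recoverySource(List<Path> listing, String historyFileName) {
        for (Path candidate : listing) {
            String name = candidate.getFileName().toString();
            if (name.equals(historyFileName)) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        List<Path> listing = Arrays.asList(
                Paths.get("history", "job_1" + RECOVER_SUFFIX),
                Paths.get("history", "job_1"));
        // The .recover sibling is skipped; the original file is chosen.
        System.out.println(recoverySource(listing, "job_1"));
    }
}
```

The trade-off is the one described above: events that would only have been 
captured in a .recover file are not replayed, in exchange for a simpler and 
cheaper recovery path.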
