Persisting completed jobs status
--------------------------------
Key: HADOOP-1876
URL: https://issues.apache.org/jira/browse/HADOOP-1876
Project: Hadoop
Issue Type: Improvement
Components: mapred
Environment: all
Reporter: Alejandro Abdelnur
Priority: Minor
Currently the JobTracker keeps information about completed jobs in memory.
This information is flushed from the cache either when it has outlived the
retirement interval (#RETIRE_JOB_INTERVAL) or when the limit of completed jobs
in memory has been reached (#MAX_COMPLETE_USER_JOBS_IN_MEMORY).
Also, if the JobTracker is restarted (because it was recycled or because it
crashed), information about completed jobs is lost.
If any of these scenarios happens before the job information is queried by a
Hadoop client (normally the job submitter or a monitoring component), there is
no way to obtain that information anymore.
A way to avoid this would be for the JobTracker to persist completed job
information in DFS upon job completion. This would be done at the time the job
is moved to the completed jobs queue. Then, when the JobTracker is queried for
information about a completed job that is not found in the in-memory queue, a
lookup in DFS would be done to retrieve the completed job information.
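As a minimal sketch of this idea (the CompletedJobStatusStore helper, the
"job-info" directory name and the fallback default for mapred.system.dir are
illustrative assumptions, not existing Hadoop classes or defaults; the
FileSystem and Writable calls are existing Hadoop APIs), the persist-on-retire
and lookup-on-miss paths could look roughly like this:

    import java.io.IOException;
    import org.apache.hadoop.conf.Configuration;
    import org.apache.hadoop.fs.FSDataInputStream;
    import org.apache.hadoop.fs.FSDataOutputStream;
    import org.apache.hadoop.fs.FileSystem;
    import org.apache.hadoop.fs.Path;
    import org.apache.hadoop.mapred.JobStatus;

    class CompletedJobStatusStore {
      private final FileSystem fs;
      private final Path jobInfoDir;   // e.g. <mapred.system.dir>/job-info

      CompletedJobStatusStore(Configuration conf) throws IOException {
        fs = FileSystem.get(conf);
        jobInfoDir = new Path(conf.get("mapred.system.dir", "/mapred/system"),
                              "job-info");
        fs.mkdirs(jobInfoDir);
      }

      // Called when the JobTracker moves the job to the completed jobs queue.
      void store(String jobId, JobStatus status) throws IOException {
        Path jobDir = new Path(jobInfoDir, jobId);
        fs.mkdirs(jobDir);
        FSDataOutputStream out = fs.create(new Path(jobDir, "status"));
        try {
          status.write(out);          // JobStatus is a Writable
        } finally {
          out.close();
        }
      }

      // Fallback lookup when the job is no longer in the in-memory queue.
      JobStatus readStatus(String jobId) throws IOException {
        Path statusFile = new Path(new Path(jobInfoDir, jobId), "status");
        if (!fs.exists(statusFile)) {
          return null;                // never persisted or already cleaned up
        }
        FSDataInputStream in = fs.open(statusFile);
        try {
          JobStatus status = new JobStatus();
          status.readFields(in);
          return status;
        } finally {
          in.close();
        }
      }
    }

The job profile, counters and completion events would be persisted and read
back the same way, since all of them are Writable as well.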
A directory in DFS (under mapred/system) would be used to persist completed job
information. For each completed job there would be a directory named after the
job ID, and within that directory all the information about the job: status,
job profile, counters and completion events. A possible layout is sketched
below.
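For illustration only (the directory and file names are suggestions, not an
existing layout):

    <mapred.system.dir>/job-info/
        <job-id>/
            status               serialized JobStatus
            profile              serialized JobProfile
            counters             serialized Counters
            completion-events    serialized TaskCompletionEvent list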
A configuration property would indicate for how long persisted job information
should be kept in DFS. After that period it would be cleaned up automatically.
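A minimal sketch of such a cleanup pass, assuming a hypothetical property
mapred.job.tracker.persist.jobstatus.hours (not an existing Hadoop property)
and using the modification time of each per-job directory as the age of the
persisted information:

    // Illustrative only: the property name and default are assumptions.
    long retainHours = conf.getLong("mapred.job.tracker.persist.jobstatus.hours",
                                    24 * 7);
    long cutoff = System.currentTimeMillis() - retainHours * 60L * 60L * 1000L;
    for (FileStatus jobDir : fs.listStatus(jobInfoDir)) {
      if (jobDir.getModificationTime() < cutoff) {
        fs.delete(jobDir.getPath(), true);   // recursively remove expired job dir
      }
    }

This pass could be run periodically by the JobTracker, for instance from the
same thread that retires jobs from the in-memory queue.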
This improvement would not introduce API changes.