[ https://issues.apache.org/jira/browse/MAPREDUCE-864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12751311#action_12751311 ]

Sharad Agarwal commented on MAPREDUCE-864:
------------------------------------------

bq. Job client will read the history file from HDFS location and construct the job information using JobHistory parser (provided as part of MAPREDUCE-157). While calling api's like Job#getCounters, Job clients would be transparent to the fact that information is being served from Jobtracker or from parsed history data.

I realized that this may have issues, since there are calls (such as getTaskCompletionEvents, getTaskReports, etc.) which need task-level information as well. One option here is to load all history data into Job, but that means loading huge data structures into memory, and many clients may not be interested in drilling down into task-level details at all. Also, reading from history transparently may confuse clients, as the cost and performance impact of the same call will change drastically depending on the source of the information.

I am inclined to NOT have org.apache.hadoop.mapreduce.Job serve data from history. Let clients use the JobHistory parser API directly to construct the info they need when a job is knocked out of the JobTracker's memory.

Thoughts?

> Enhance JobClient API implementations to look at history files to get
> information about jobs that are not in memory
> -------------------------------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-864
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-864
>             Project: Hadoop Map/Reduce
>          Issue Type: Sub-task
>          Components: jobtracker
>            Reporter: Devaraj Das
>            Assignee: Sharad Agarwal
>             Fix For: 0.21.0
>
>
> MAPREDUCE-817 added an API to get the JobHistory URL from the JobTracker.
> This is useful in two ways:
> 1) Users can use this API to get the URL, copy the history files to their
> local disk, and, do processing on them
> 2) APIs like JobSubmissionProtocol.getJobCounters, can read a part of the
> history file, and then return the information to the caller (if the job is
> not there in JT memory). This would mimic most of the
> CompletedJobsStatusStore functionality.
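
To make the trade-off in the comment concrete, here is a minimal sketch of a client that performs the history fallback explicitly instead of having Job do it transparently. Every name in it (HistoryFallbackSketch, LiveJobSource, HistorySource, CounterResult, Origin, lookupCounters) is hypothetical and exists only for illustration; none of them is part of Hadoop, and the real JobHistory parser API from MAPREDUCE-157 is not used here. The point is only that the caller sees which source answered, and therefore which cost it paid.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

public class HistoryFallbackSketch {

    /** Hypothetical stand-in for a lookup served from JobTracker memory (the cheap path). */
    interface LiveJobSource {
        Optional<Map<String, Long>> getCounters(String jobId);
    }

    /** Hypothetical stand-in for parsing a job history file (the expensive path). */
    interface HistorySource {
        Map<String, Long> parseCounters(String jobId);
    }

    /** Records which source produced the answer, so the cost is visible to the caller. */
    enum Origin { JOBTRACKER, HISTORY }

    static final class CounterResult {
        final Origin origin;
        final Map<String, Long> counters;

        CounterResult(Origin origin, Map<String, Long> counters) {
            this.origin = origin;
            this.counters = counters;
        }
    }

    /**
     * Asks the in-memory source first. Only if the job is no longer there does the
     * client itself decide to parse history, rather than a Job object doing so silently.
     */
    static CounterResult lookupCounters(String jobId, LiveJobSource live, HistorySource history) {
        Optional<Map<String, Long>> inMemory = live.getCounters(jobId);
        if (inMemory.isPresent()) {
            return new CounterResult(Origin.JOBTRACKER, inMemory.get());
        }
        return new CounterResult(Origin.HISTORY, history.parseCounters(jobId));
    }

    public static void main(String[] args) {
        // Made-up counter values standing in for parsed history data.
        Map<String, Long> parsed = new HashMap<>();
        parsed.put("MAP_INPUT_RECORDS", 100L);

        // Simulate a job that has already been dropped from JobTracker memory.
        LiveJobSource live = jobId -> Optional.empty();
        HistorySource history = jobId -> parsed;

        CounterResult result = lookupCounters("job_example_0001", live, history);
        System.out.println(result.origin + " " + result.counters);
    }
}
```

A client that needs only counters never pays for task-level structures in this arrangement; one that needs task reports would call a separate, equally explicit history method.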