[ 
https://issues.apache.org/jira/browse/MAPREDUCE-864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12751311#action_12751311
 ] 

Sharad Agarwal commented on MAPREDUCE-864:
------------------------------------------

bq. Job client will read the history file from HDFS location and construct the 
job information using JobHistory parser (provided as part of MAPREDUCE-157). 
When calling APIs like Job#getCounters, job clients would not see whether 
the information is being served from the JobTracker or from parsed 
history data.

I realized that this may cause problems, since some calls (such as 
getTaskCompletionEvents, getTaskReports, etc.) need task-level information 
as well. One option is to load all of the history data into Job, but that 
means loading huge data structures into memory, and many clients may not be 
interested in drilling down into task-level details at all. Also, reading 
from history transparently may confuse clients, because the cost and 
performance of the same call would change drastically depending on the 
source of the information. 
I am inclined NOT to have org.apache.hadoop.mapreduce.Job serve data from 
history. Instead, let clients use the JobHistory parser API directly to 
construct the info they need once a job has been evicted from the 
JobTracker's memory. Thoughts?
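
To make the trade-off concrete, here is a rough Java sketch of the two 
options. Every type and method name in it (InMemoryTracker, HistoryStore, 
transparentGetCounters, explicitGetCounters) is a hypothetical stand-in, not 
the real JobTracker, JobHistory parser, or org.apache.hadoop.mapreduce.Job 
API. The "parse" is just a map lookup with a counter, assumed here only so 
the hidden cost of the fallback is visible.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical sketch only: none of these types are Hadoop classes.
public class HistoryFallbackSketch {

    /** Stand-in for the jobs the tracker still holds in memory. */
    static class InMemoryTracker {
        private final Map<String, Map<String, Long>> counters = new HashMap<>();

        void retain(String jobId, Map<String, Long> jobCounters) {
            counters.put(jobId, jobCounters);
        }

        void retire(String jobId) {
            counters.remove(jobId);
        }

        Optional<Map<String, Long>> getCounters(String jobId) {
            return Optional.ofNullable(counters.get(jobId));
        }
    }

    /** Stand-in for history files; parseCount models the expensive parse. */
    static class HistoryStore {
        private final Map<String, Map<String, Long>> files = new HashMap<>();
        int parseCount = 0;

        void write(String jobId, Map<String, Long> jobCounters) {
            files.put(jobId, jobCounters);
        }

        Map<String, Long> parseCounters(String jobId) {
            parseCount++;
            Map<String, Long> result = files.get(jobId);
            if (result == null) {
                throw new IllegalArgumentException("no history for " + jobId);
            }
            return result;
        }
    }

    /** Option A: transparent fallback; the caller cannot tell which path ran. */
    static Map<String, Long> transparentGetCounters(
            InMemoryTracker tracker, HistoryStore history, String jobId) {
        return tracker.getCounters(jobId)
                .orElseGet(() -> history.parseCounters(jobId));
    }

    /** Option B: serve only from memory; the caller must opt in to history. */
    static Map<String, Long> explicitGetCounters(
            InMemoryTracker tracker, String jobId) {
        return tracker.getCounters(jobId).orElseThrow(
                () -> new IllegalStateException(
                        "job " + jobId + " is not in memory; read its history"));
    }

    public static void main(String[] args) {
        InMemoryTracker tracker = new InMemoryTracker();
        HistoryStore history = new HistoryStore();
        Map<String, Long> c = Collections.singletonMap("MAP_INPUT_RECORDS", 100L);
        tracker.retain("job_1", c);
        history.write("job_1", c);

        tracker.retire("job_1");
        // Same call as before retirement, but it now triggers a history parse.
        transparentGetCounters(tracker, history, "job_1");
        System.out.println("history parses: " + history.parseCount);
    }
}
```

With option B the client sees the failure and decides for itself whether 
parsing the history file is worth the cost, which is the behaviour I am 
leaning towards.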



> Enhance JobClient API implementations to look at history files to get 
> information about jobs that are not in memory
> -------------------------------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-864
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-864
>             Project: Hadoop Map/Reduce
>          Issue Type: Sub-task
>          Components: jobtracker
>            Reporter: Devaraj Das
>            Assignee: Sharad Agarwal
>             Fix For: 0.21.0
>
>
> MAPREDUCE-817 added an API to get the JobHistory URL from the JobTracker. 
> This is useful in two ways:
> 1) Users can use this API to get the URL, copy the history files to their 
> local disk, and process them.
> 2) APIs like JobSubmissionProtocol.getJobCounters can read part of the 
> history file and then return the information to the caller (if the job is 
> no longer in JT memory). This would mimic most of the 
> CompletedJobsStatusStore functionality.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
