[jira] Commented: (MAPREDUCE-740) Provide summary information per job once a job is finished.

Hong Tang (JIRA) Thu, 09 Jul 2009 13:20:40 -0700

    [ https://issues.apache.org/jira/browse/MAPREDUCE-740?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12729400#action_12729400 ]


Hong Tang commented on MAPREDUCE-740:
-------------------------------------

@vinod

I do have a specific use case: we want to keep track of the amount of 
resources used by each job, each user, or each queue (for the capacity 
scheduler). Granted, all of this information is readily available in the job 
history logs. However, depending on the job history logs has a few drawbacks: 
(1) we are interested in keeping a history of finished jobs and possibly doing 
group-bys by user and queue, so scraping individual history logs is messy; 
(2) it adds a dependency on the history log format, so we would have to keep 
up with possible future changes to it.

For starters, I think the summary should include the following information: 
        - job queuing/waiting time
        - job start time
        - job finish time
        - total maps/reduces
        - user id
        - job id (job-tracker ID + job sequence number)
        - map/reduce slot hours (need to apply a multiplier for high-RAM tasks 
that take multiple slots per map/reduce task)
        - queue name
        - job status (success or failure)
        - cluster map/reduce slot capacity
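
To make the list above concrete, here is a minimal sketch of what a one-line 
summary record could look like. The field names, the key=value layout, and the 
sample job id are assumptions for illustration only; none of them is an 
existing JobTracker format.

```python
from dataclasses import dataclass, fields


@dataclass
class JobSummary:
    """Hypothetical per-job summary; field names mirror the list above."""
    job_id: str                  # job-tracker ID + job sequence number
    user: str
    queue: str
    status: str                  # e.g. "SUCCESS" or "FAILED" (assumed values)
    submit_time_ms: int
    start_time_ms: int
    finish_time_ms: int
    total_maps: int
    total_reduces: int
    map_slot_hours: float
    reduce_slot_hours: float
    cluster_map_capacity: int
    cluster_reduce_capacity: int

    @property
    def wait_time_ms(self) -> int:
        # Queuing/waiting time: from submission until the job starts.
        return self.start_time_ms - self.submit_time_ms

    def to_line(self) -> str:
        # Emit every field as key=value on a single comma-separated line,
        # followed by the derived waiting time.
        pairs = [(f.name, getattr(self, f.name)) for f in fields(self)]
        pairs.append(("wait_time_ms", self.wait_time_ms))
        return ",".join(f"{key}={value}" for key, value in pairs)
```

A single line per job keeps later group-bys by user or queue simple. Note that 
this sketch does not escape commas inside values, which a real format would 
have to handle.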

The only thing that the job history log does not currently provide is the slot 
hours for all maps and reduces belonging to the same job.
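
As a rough sketch of how slot hours could be computed, assume each task 
attempt records its start time, its finish time, and the number of slots it 
occupies (more than one for high-RAM tasks). The TaskAttempt type and the 
slot_hours function below are hypothetical names, not existing Hadoop APIs.

```python
from typing import Iterable, NamedTuple

MS_PER_HOUR = 3_600_000


class TaskAttempt(NamedTuple):
    """Hypothetical record of one map or reduce task attempt."""
    start_ms: int
    finish_ms: int
    slots: int = 1  # high-RAM tasks may occupy several slots


def slot_hours(attempts: Iterable[TaskAttempt]) -> float:
    # Weight each attempt's run time by the number of slots it held,
    # then convert the total from slot-milliseconds to slot-hours.
    total_slot_ms = sum((a.finish_ms - a.start_ms) * a.slots for a in attempts)
    return total_slot_ms / MS_PER_HOUR
```

For example, a 30-minute single-slot map plus a 60-minute map that holds two 
slots gives 0.5 + 2.0 = 2.5 map slot hours. The same function would be applied 
separately to the reduce attempts.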

> Provide summary information per job once a job is finished.
> -----------------------------------------------------------
>
>                 Key: MAPREDUCE-740
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-740
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>            Reporter: Hong Tang
>            Priority: Minor
>
> It would be nice if the JobTracker could output one line of summary 
> information per job once the job is finished. Otherwise, users or system 
> administrators would end up scraping individual job history logs.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
