[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2124?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12926444#action_12926444
 ] 

Chris Douglas commented on MAPREDUCE-2124:
------------------------------------------

Either way is OK with me, but I'm not wholly clear on its intended audience. 
The SLOTS_MILLIS_* counters are useful to operators and developers, as they 
provide information about the efficiency of the scheduler: they're useful for 
bottleneck analysis of repeated sets of jobs, tuning of the aggregate cluster, 
and comparing different runs of concurrent pipelines. By only accumulating the 
time that was actually spent doing work, the proposed counters could measure 
the efficiency of the job and be useful to the user, for tuning parameters like 
slowstart (a long shuffle time for small amounts of intermediate data might 
indicate that the job is scheduling reduces too early).
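
As a rough illustration of how a user might read such counters, the sketch 
below computes the share of reduce-side time spent in each phase. It is only a 
sketch under assumptions: the class and method names are hypothetical, the 
three inputs stand in for the values of the counters proposed in this issue, 
and the numbers in main are made up. It does not use any Hadoop API.

```java
import java.util.Locale;

public class ReducePhaseShare {
    /**
     * Fraction of the summed phase time spent in one phase.
     * Returns 0 when no time was recorded, to avoid dividing by zero.
     */
    public static double share(long phaseMillis, long copyMillis,
                               long sortMillis, long reduceMillis) {
        long total = copyMillis + sortMillis + reduceMillis;
        return total == 0 ? 0.0 : (double) phaseMillis / total;
    }

    public static void main(String[] args) {
        // Hypothetical counter values in milliseconds (not from a real job).
        long copy = 600_000L;
        long sort = 50_000L;
        long reduce = 350_000L;

        System.out.printf(Locale.ROOT, "copy share:   %.2f%n",
                share(copy, copy, sort, reduce));
        System.out.printf(Locale.ROOT, "sort share:   %.2f%n",
                share(sort, copy, sort, reduce));
        System.out.printf(Locale.ROOT, "reduce share: %.2f%n",
                share(reduce, copy, sort, reduce));
    }
}
```

A large copy share on a job with little intermediate data would be the kind 
of signal that suggests raising slowstart so reduces are scheduled later.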

Most of the framework counters (FileSystem, framework bytes and records) 
provide feedback to the user, to help determine if their job is written 
correctly and tuned efficiently. The proposed metric is slightly different, 
because it's not a property of a particular MapReduce job (e.g. a job where 
every reduce fails once could look "efficient" by this metric). I guess my 
question would be: if this information is presented in every user job, then 
how should (s)he react to it? If it's not user-centric and is only another 
presentation of data the operator already has, then it seems less well 
motivated to me. All that said, the cost is low, so if you feel it's useful 
then I've no objection to it.

> Add job counters for measuring time spent in three different phases in 
> reducers
> -------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-2124
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2124
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: jobtracker
>    Affects Versions: 0.22.0
>            Reporter: Scott Chen
>            Assignee: Scott Chen
>            Priority: Minor
>             Fix For: 0.22.0
>
>         Attachments: MAPREDUCE-2124-v2.txt, MAPREDUCE-2124.txt
>
>
> We currently have SLOTS_MILLIS_REDUCES, which measures the total slot time of 
> reducers.
> It will be useful if we have
> {code}
> SLOTS_MILLIS_REDUCES_COPY
> SLOTS_MILLIS_REDUCES_SORT
> SLOTS_MILLIS_REDUCES_REDUCE
> {code}
> which measure the three different phases of a reducer.
> This will help us identify the bottleneck of the reducers.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
