[ 
https://issues.apache.org/jira/browse/MAPREDUCE-2037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12906680#action_12906680
 ] 

Vinod K V commented on MAPREDUCE-2037:
--------------------------------------

I hadn't realized this before, but MAPREDUCE-220 already captures the 
CPU/memory load at the time of task completion. So the core functionality is 
already there in trunk.

But the load at the time of task completion isn't really a useful stat. +1 for 
either exponential smoothing or simply capturing the highest, lowest, and 
average loads for CPU and memory.
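
To make the two options concrete, here is a minimal sketch of an accumulator 
that tracks both an exponentially smoothed load and the highest, lowest, and 
average loads. The class name LoadStats, its methods, and the smoothing factor 
alpha are hypothetical and chosen for illustration only; nothing here is 
existing Hadoop or MAPREDUCE-220 code.

```java
/** Hypothetical load accumulator for one resource (CPU or memory); not a Hadoop class. */
final class LoadStats {
    // Assumed smoothing factor in (0, 1]; larger values weight recent samples more.
    private final double alpha;
    private double smoothed;
    private double min = Double.POSITIVE_INFINITY;
    private double max = Double.NEGATIVE_INFINITY;
    private double sum;
    private long count;

    LoadStats(double alpha) {
        if (!(alpha > 0.0 && alpha <= 1.0)) {
            throw new IllegalArgumentException("alpha must be in (0, 1]");
        }
        this.alpha = alpha;
    }

    /** Records one load sample, e.g. taken on each heartbeat. */
    void add(double sample) {
        // The first sample seeds the smoothed value; later samples are blended in.
        smoothed = (count == 0) ? sample : alpha * sample + (1.0 - alpha) * smoothed;
        min = Math.min(min, sample);
        max = Math.max(max, sample);
        sum += sample;
        count++;
    }

    double smoothed() { return count == 0 ? Double.NaN : smoothed; }
    double min()      { return count == 0 ? Double.NaN : min; }
    double max()      { return count == 0 ? Double.NaN : max; }
    double mean()     { return count == 0 ? Double.NaN : sum / count; }
}
```

One instance per resource would be enough: call add() whenever a load sample 
is available and read the four values when the task finishes. The smoothed 
value favors recent behavior, while min/max/mean summarize the whole run.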

> Capturing interim progress times, CPU usage, and memory usage, when tasks 
> reach certain progress thresholds
> -----------------------------------------------------------------------------------------------------------
>
>                 Key: MAPREDUCE-2037
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-2037
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>            Reporter: Dick King
>            Assignee: Dick King
>             Fix For: 0.22.0
>
>
> We would like to capture the following information at certain progress 
> thresholds as a task runs:
>    * Time taken so far
>    * CPU load [either at the time the data are taken, or exponentially 
> smoothed]
>    * Memory load [also either at the time the data are taken, or 
> exponentially smoothed]
> This would be taken at intervals that depend on the task progress plateaus.  
> For example, reducers have three progress ranges -- [0-1/3], (1/3-2/3], and 
> (2/3-3/3] -- where fundamentally different activities happen.  Mappers have 
> different boundaries, I understand, that are not symmetrically placed.  Data 
> capture boundaries should coincide with activity boundaries.  For the state 
> information capture [CPU and memory] we should average over the covered 
> interval.
> This data would flow in with the heartbeats.  It would be placed in the job 
> history as part of the task attempt completion event, so it could be 
> processed by rumen or some similar tool and could drive a benchmark engine.
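
The capture scheme in the quoted description could be sketched roughly as 
follows: samples are accumulated as the task runs, and each time the progress 
crosses a threshold a record is emitted with the time taken so far and the CPU 
and memory loads averaged over the interval just covered. The names 
ProgressCapture, Snapshot, and update() are hypothetical, as is the assumption 
that loads arrive as plain doubles; this is not the proposed patch.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical per-task-attempt recorder; not a Hadoop class. */
final class ProgressCapture {
    /** One record per crossed progress threshold. */
    static final class Snapshot {
        final double threshold;
        final long elapsedMillis;
        final double avgCpu;
        final double avgMem;

        Snapshot(double threshold, long elapsedMillis, double avgCpu, double avgMem) {
            this.threshold = threshold;
            this.elapsedMillis = elapsedMillis;
            this.avgCpu = avgCpu;
            this.avgMem = avgMem;
        }
    }

    private final long startMillis;
    private final double[] thresholds; // assumed to be in ascending order
    private final List<Snapshot> snapshots = new ArrayList<>();
    private int next;        // index of the next threshold to cross
    private double cpuSum;   // sums over the current interval only
    private double memSum;
    private int samples;

    ProgressCapture(long startMillis, double... thresholds) {
        this.startMillis = startMillis;
        this.thresholds = thresholds.clone();
    }

    /** Feeds one observation, e.g. once per heartbeat. */
    void update(long nowMillis, double progress, double cpu, double mem) {
        cpuSum += cpu;
        memSum += mem;
        samples++;
        while (next < thresholds.length && progress >= thresholds[next]) {
            // If several thresholds are crossed at once, later ones have no
            // samples of their own, so fall back to the current observation.
            double avgCpu = samples == 0 ? cpu : cpuSum / samples;
            double avgMem = samples == 0 ? mem : memSum / samples;
            snapshots.add(new Snapshot(thresholds[next], nowMillis - startMillis, avgCpu, avgMem));
            next++;
            cpuSum = 0.0;
            memSum = 0.0;
            samples = 0;
        }
    }

    List<Snapshot> snapshots() {
        return snapshots;
    }
}
```

For a reducer, the thresholds would be 1.0/3, 2.0/3, and 1.0 to match the 
three ranges mentioned above; a mapper would pass its own boundaries. The 
resulting list is the kind of data that could be attached to the task attempt 
completion event.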

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
