[ https://issues.apache.org/jira/browse/MAPREDUCE-2037?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13085220#comment-13085220 ]
Hudson commented on MAPREDUCE-2037:
-----------------------------------
Integrated in Hadoop-Common-trunk-Commit #742 (See
[https://builds.apache.org/job/Hadoop-Common-trunk-Commit/742/])
MAPREDUCE-2037. Capture intermediate progress, CPU and memory usage for
tasks. Contributed by Dick King.
acmurthy :
http://svn.apache.org/viewcvs.cgi/?root=Apache-SVN&view=rev&rev=1157253
Files :
* /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapreduce/jobhistory/AvroArrayUtils.java
* /hadoop/common/trunk/mapreduce/src/tools/org/apache/hadoop/tools/rumen/MapTaskAttemptInfo.java
* /hadoop/common/trunk/mapreduce/src/java/mapred-default.xml
* /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/TaskInProgress.java
* /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/Counters.java
* /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapreduce/jobhistory/Events.avpr
* /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapreduce/jobhistory/TaskAttemptUnsuccessfulCompletionEvent.java
* /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/StatePeriodicStats.java
* /hadoop/common/trunk/mapreduce/src/test/mapred/org/apache/hadoop/tools/rumen/TestRumenJobTraces.java
* /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapreduce/jobhistory/ReduceAttemptFinishedEvent.java
* /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapreduce/server/jobtracker/JTConfig.java
* /hadoop/common/trunk/mapreduce/src/test/mapred/org/apache/hadoop/mapred/TestTaskPerformanceSplits.java
* /hadoop/common/trunk/mapreduce/src/tools/org/apache/hadoop/tools/rumen/ZombieJob.java
* /hadoop/common/trunk/mapreduce/src/tools/org/apache/hadoop/tools/rumen/ReduceAttempt20LineHistoryEventEmitter.java
* /hadoop/common/trunk/mapreduce/src/tools/org/apache/hadoop/tools/rumen/TaskAttemptInfo.java
* /hadoop/common/trunk/mapreduce/src/test/mapred/org/apache/hadoop/mapreduce/jobhistory/TestJobHistoryEvents.java
* /hadoop/common/trunk/mapreduce/CHANGES.txt
* /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/CumulativePeriodicStats.java
* /hadoop/common/trunk/mapreduce/src/tools/org/apache/hadoop/tools/rumen/ReduceTaskAttemptInfo.java
* /hadoop/common/trunk/mapreduce/src/tools/org/apache/hadoop/tools/rumen/TaskAttempt20LineEventEmitter.java
* /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/JobInProgress.java
* /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/PeriodicStatsAccumulator.java
* /hadoop/common/trunk/mapreduce/src/tools/org/apache/hadoop/tools/rumen/JobBuilder.java
* /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapred/ProgressSplitsBlock.java
* /hadoop/common/trunk/mapreduce/src/tools/org/apache/hadoop/tools/rumen/LoggedTaskAttempt.java
* /hadoop/common/trunk/mapreduce/src/java/org/apache/hadoop/mapreduce/jobhistory/MapAttemptFinishedEvent.java
* /hadoop/common/trunk/mapreduce/src/tools/org/apache/hadoop/tools/rumen/MapAttempt20LineHistoryEventEmitter.java
> Capturing interim progress times, CPU usage, and memory usage, when tasks
> reach certain progress thresholds
> -----------------------------------------------------------------------------------------------------------
>
> Key: MAPREDUCE-2037
> URL: https://issues.apache.org/jira/browse/MAPREDUCE-2037
> Project: Hadoop Map/Reduce
> Issue Type: New Feature
> Reporter: Dick King
> Assignee: Dick King
> Fix For: 0.23.0
>
> Attachments: MAPREDUCE-2037.patch, MAPREDUCE-2037.patch
>
>
> We would like to capture the following information at certain progress
> thresholds as a task runs:
> * Time taken so far
> * CPU load [either at the time the data are taken, or exponentially
> smoothed]
> * Memory load [also either at the time the data are taken, or
> exponentially smoothed]
> These measurements would be taken at intervals that depend on the task
> progress plateaus. For example, reducers have three progress ranges --
> [0-1/3], (1/3-2/3], and (2/3-3/3] -- where fundamentally different
> activities happen. As I understand it, mappers have different boundaries
> that are not symmetrically placed. Data capture boundaries should coincide
> with activity boundaries. For the state information [CPU and memory], we
> should average the readings over the covered interval.
> This data would flow in with the heartbeats. It would be placed in the job
> history as part of the task attempt completion event, so it could be
> processed by rumen or some similar tool and could drive a benchmark engine.
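
To make the averaging idea in the description concrete, here is a minimal
Java sketch. It is not the code from the committed patch: the class name
ProgressIntervalAverager, its methods, and the rule of attributing the time
since the previous heartbeat to the interval that contains the current
progress value are all assumptions made for illustration. It keeps a
time-weighted average of one state reading (for example, CPU load) per
progress interval, along with the wall-clock time spent in each interval.

```java
/**
 * Hypothetical sketch only; NOT the PeriodicStatsAccumulator from the patch.
 * Averages a state reading over each progress interval, weighted by time.
 */
final class ProgressIntervalAverager {
    private final double[] upperBounds;   // ascending upper bounds, e.g. 1/3, 2/3, 1
    private final double[] weightedSum;   // sum of reading * elapsed millis, per interval
    private final long[] coveredMillis;   // elapsed millis attributed to each interval
    private long lastTimeMillis;          // time of the previous heartbeat

    ProgressIntervalAverager(long startTimeMillis, double... upperBounds) {
        this.upperBounds = upperBounds.clone();
        this.weightedSum = new double[upperBounds.length];
        this.coveredMillis = new long[upperBounds.length];
        this.lastTimeMillis = startTimeMillis;
    }

    /** Index of the first interval whose (inclusive) upper bound covers progress. */
    private int intervalOf(double progress) {
        for (int i = 0; i < upperBounds.length; i++) {
            if (progress <= upperBounds[i]) {
                return i;
            }
        }
        return upperBounds.length - 1; // clamp progress values above the last bound
    }

    /**
     * Records one heartbeat. Assumption: the whole time since the previous
     * heartbeat is attributed to the interval containing the current progress.
     */
    void heartbeat(long timeMillis, double progress, double reading) {
        long delta = timeMillis - lastTimeMillis;
        int i = intervalOf(progress);
        weightedSum[i] += reading * delta;
        coveredMillis[i] += delta;
        lastTimeMillis = timeMillis;
    }

    /** Time-weighted average reading for an interval, or 0.0 if it has no data. */
    double average(int interval) {
        long covered = coveredMillis[interval];
        return covered == 0 ? 0.0 : weightedSum[interval] / covered;
    }

    /** Wall-clock milliseconds attributed to an interval so far. */
    long timeSpentMillis(int interval) {
        return coveredMillis[interval];
    }
}
```

For a reducer, such an object could be built with the bounds 1.0/3, 2.0/3,
and 1.0 to match the three ranges named in the description. A mapper would
pass whatever boundaries apply to it.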
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira