[ 
https://issues.apache.org/jira/browse/MAPREDUCE-901?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Devaraj Das updated MAPREDUCE-901:
----------------------------------

    Attachment: 901_1.patch

Attaching a patch for review. I am still testing the patch. Also, a little bit 
of cleanup is required, especially w.r.t. naming variables/fields in the 
classes. I will do that in a follow-up patch.

Some points on the approach:
1) Defined a class TaskMetrics that has methods for updating the counters 
defined in o.a.h.mapreduce.TaskCounter.java. It also provides a utility method 
to update framework counters that aren't defined in TaskCounter.java. Examples 
of such counters are the ones that the framework defines in the counter group 
FileSystemCounters. For the TaskCounter counters, the RPC is optimized. For 
other framework counters like the FileSystemCounters, the RPC uses the 
Counters serialization. 
2) The TaskMetrics object is serialized out as part of the TaskStatus object 
in the heartbeats.
3) In TaskInProgress.java, the TIP's Counters are updated with the TaskMetrics 
counters obtained in the heartbeat.
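
To make the three points concrete, here is a rough sketch of the idea. It is 
not the code in 901_1.patch. The names TaskMetricsSketch and SketchTaskCounter, 
the "group:name" key format, and the wire layout are all assumptions made for 
illustration, and plain java.io streams stand in for Hadoop's Writable and 
Counters serialization. It also assumes each heartbeat carries cumulative 
totals rather than deltas.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical stand-in for the counters defined in TaskCounter.java. */
enum SketchTaskCounter {
    MAP_INPUT_RECORDS, MAP_OUTPUT_RECORDS, MAP_OUTPUT_BYTES, SPILLED_RECORDS
}

/** Illustrative sketch only; not the TaskMetrics class from the patch. */
class TaskMetricsSketch {
    private static final SketchTaskCounter[] KEYS = SketchTaskCounter.values();

    // Point 1: one slot per well-known counter, indexed by enum ordinal.
    private final long[] fixed = new long[KEYS.length];
    // Point 1: other framework counters, keyed by "group:name" (assumed format).
    private final Map<String, Long> other = new TreeMap<>();

    void increment(SketchTaskCounter c, long delta) {
        fixed[c.ordinal()] += delta;
    }

    void incrementOther(String group, String name, long delta) {
        other.merge(group + ":" + name, delta, Long::sum);
    }

    long get(SketchTaskCounter c) {
        return fixed[c.ordinal()];
    }

    long getOther(String group, String name) {
        return other.getOrDefault(group + ":" + name, 0L);
    }

    /** Point 2: well-known counters go out as bare longs; the rest carry names. */
    byte[] toBytes() {
        ByteArrayOutputStream buf = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buf)) {
            for (long v : fixed) {
                out.writeLong(v);
            }
            out.writeInt(other.size());
            for (Map.Entry<String, Long> e : other.entrySet()) {
                out.writeUTF(e.getKey());
                out.writeLong(e.getValue());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buf.toByteArray();
    }

    static TaskMetricsSketch fromBytes(byte[] data) {
        TaskMetricsSketch m = new TaskMetricsSketch();
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            for (int i = 0; i < m.fixed.length; i++) {
                m.fixed[i] = in.readLong();
            }
            int n = in.readInt();
            for (int i = 0; i < n; i++) {
                String key = in.readUTF();
                m.other.put(key, in.readLong());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return m;
    }

    /** Point 3: receiver side overwrites its view with the latest totals (assumed). */
    void applyTo(Map<String, Long> tipCounters) {
        for (SketchTaskCounter c : KEYS) {
            tipCounters.put("TaskCounter:" + c.name(), fixed[c.ordinal()]);
        }
        tipCounters.putAll(other);
    }
}
```

The point of the split is that the well-known counters need no names on the 
wire, since both sides agree on the enum order, while the remaining framework 
counters still travel with their group and name.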

Would really appreciate a review on this one.

And yes, this looks like a good thing to have for MAPREDUCE-220 and 
MAPREDUCE-718.

> Move Framework Counters into a TaskMetric structure
> ---------------------------------------------------
>
>                 Key: MAPREDUCE-901
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-901
>             Project: Hadoop Map/Reduce
>          Issue Type: Improvement
>          Components: task
>    Affects Versions: 0.21.0
>            Reporter: Owen O'Malley
>            Assignee: Devaraj Das
>             Fix For: 0.21.0
>
>         Attachments: 901_1.patch
>
>
> I think we should move all of the Counters that the framework updates into a 
> single class called TaskMetrics. TaskMetrics would have specific fields for 
> each of the metrics like input records, input bytes, output records, etc.
> It would both reduce the serialized size of the heartbeats (by shrinking the 
> Counters down to just the user's counters) and decrease the latency for 
> updates to the JobTracker (since Counters are sent at most 1/minute instead 
> of 1/heartbeat).

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
