[ https://issues.apache.org/jira/browse/MAPREDUCE-220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12864065#action_12864065 ]
Scott Chen commented on MAPREDUCE-220:
--------------------------------------

Hey guys,

Thanks for the help. I am not familiar with the counters, but from Arun's and Vinod's comments I can see the benefits:
1. Reuse of the counter logging and transmitting
2. Easier to expose to end users
This is really good!

But as Dhruba mentioned, we want to use this information for scheduling. Measuring it and then sending it with the heartbeat ensures the scheduler gets the latest information. One minute may be too slow for scheduling.

The other question I have: using counters, can we aggregate with other methods (e.g. max) rather than just incrementing values?
My original plan is to report this information in this issue and aggregate it into job-level status in MAPREDUCE-1739.
I am planning to generate these fields after aggregation:
1. Total CPU cycles (# of giga-cycles)
2. Total memory occupied time (GB-sec)
3. Maximum peak memory on one task (GB)
4. Maximum peak CPU on one task (GHz)
Is it possible to get these fields by using the counters?

I will read the relevant code and think more about it. Maybe there's a way to obtain both benefits.

Vinod: I also feel that there is a lot of redundant creation/computation of processTree. Maybe we should refactor the code and use one thread to compute it and expose the information to others.

> Collecting cpu and memory usage for MapReduce tasks
> ---------------------------------------------------
>
>                 Key: MAPREDUCE-220
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-220
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: task, tasktracker
>            Reporter: Hong Tang
>            Assignee: Scott Chen
>             Fix For: 0.22.0
>
>         Attachments: MAPREDUCE-220-v1.txt, MAPREDUCE-220.txt
>
>
> It would be nice for TaskTracker to collect cpu and memory usage for
> individual Map or Reduce tasks over time.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
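
For illustration only: the four job-level fields listed in the comment mix two kinds of aggregation, sums (total CPU cycles, memory occupied time) and maxima (peak memory, peak CPU). The sketch below shows that distinction in plain Java. It is not Hadoop code; `TaskUsage`, `JobUsage`, and all field names are hypothetical, and it assumes each task reports one final sample.

```java
import java.util.List;

// Hypothetical per-task usage sample; not a Hadoop class.
final class TaskUsage {
    final double cpuGigaCycles; // CPU cycles consumed by the task, in giga-cycles
    final double memoryGbSec;   // memory occupied over time, in GB-seconds
    final double peakMemoryGb;  // highest memory footprint seen, in GB
    final double peakCpuGhz;    // highest CPU rate seen, in GHz

    TaskUsage(double cpuGigaCycles, double memoryGbSec,
              double peakMemoryGb, double peakCpuGhz) {
        this.cpuGigaCycles = cpuGigaCycles;
        this.memoryGbSec = memoryGbSec;
        this.peakMemoryGb = peakMemoryGb;
        this.peakCpuGhz = peakCpuGhz;
    }
}

// Hypothetical job-level summary built from per-task samples.
final class JobUsage {
    double totalCpuGigaCycles; // sum over tasks
    double totalMemoryGbSec;   // sum over tasks
    double maxPeakMemoryGb;    // max over tasks
    double maxPeakCpuGhz;      // max over tasks

    // Fold one task into the summary: totals are added, peaks are maxed.
    void add(TaskUsage t) {
        totalCpuGigaCycles += t.cpuGigaCycles;
        totalMemoryGbSec += t.memoryGbSec;
        maxPeakMemoryGb = Math.max(maxPeakMemoryGb, t.peakMemoryGb);
        maxPeakCpuGhz = Math.max(maxPeakCpuGhz, t.peakCpuGhz);
    }

    static JobUsage aggregate(List<TaskUsage> tasks) {
        JobUsage job = new JobUsage();
        for (TaskUsage t : tasks) {
            job.add(t);
        }
        return job;
    }
}
```

The two sum fields fit an increment-only counter naturally, while the two peak fields need a max-style merge, which is the crux of the question about whether counters can aggregate with anything other than addition.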