[ https://issues.apache.org/jira/browse/MAPREDUCE-220?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12887672#action_12887672 ]
M. C. Srivas commented on MAPREDUCE-220:
----------------------------------------

We've found that disk bandwidth is virtually unlimited compared to other factors, especially the network, so measuring and collecting it is not worthwhile for scheduling. More interesting is disk-ops-per-second-per-drive: it identifies bad data layout immediately (i.e., one disk will be very hot even though it might be transferring very little data). Unfortunately, using ops/second/disk to schedule work is still not very useful, since bad data layout will not change just because we schedule less.

The network is a big bottleneck, but bytes-in/bytes-out per unit of time is not representative of a problem. If we had some measure of congestion, we could use it to increase or decrease scheduling locality (e.g., if the network gets congested, reduce the percentage of non-local tasks). To derive a congestion metric we need to know round-trip times under "normal" vs. "congested" conditions, dropped-packet counts, retransmit counts, etc. (Perhaps add some sockopts to tell us this? TCP knows this, after all.)

CPU, memory, and swapping therefore still seem to be the most useful metrics.

> Collecting cpu and memory usage for MapReduce tasks
> ---------------------------------------------------
>
>                 Key: MAPREDUCE-220
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-220
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: task, tasktracker
>            Reporter: Hong Tang
>            Assignee: Scott Chen
>             Fix For: 0.22.0
>
>         Attachments: MAPREDUCE-220-20100616.txt, MAPREDUCE-220-v1.txt, MAPREDUCE-220.txt
>
>
> It would be nice for TaskTracker to collect cpu and memory usage for individual Map or Reduce tasks over time.
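As a rough illustration of the disk-ops-per-second-per-drive idea in the comment above, a TaskTracker-side collector could sample Linux /proc/diskstats. This is only a sketch under that assumption; the DiskOpsSampler class name and the sampling approach are illustrative and are not part of any patch attached to this issue.

{code:java}
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;
import java.util.HashMap;
import java.util.Map;

/**
 * Sketch: sample completed read+write operations per drive from Linux
 * /proc/diskstats, so a "hot" disk (many ops, little data) stands out.
 */
public class DiskOpsSampler {
  // last observed cumulative op count, keyed by device name
  private final Map<String, Long> lastOps = new HashMap<String, Long>();

  /** Returns ops completed since the previous call, keyed by device name. */
  public Map<String, Long> sampleOpsDelta() throws IOException {
    Map<String, Long> delta = new HashMap<String, Long>();
    BufferedReader in = new BufferedReader(new FileReader("/proc/diskstats"));
    try {
      String line;
      while ((line = in.readLine()) != null) {
        String[] f = line.trim().split("\\s+");
        if (f.length < 11) {
          continue;                       // not a full per-device stats line
        }
        String dev = f[2];                // device name, e.g. "sda"
        // f[3] = reads completed, f[7] = writes completed (both cumulative)
        long ops = Long.parseLong(f[3]) + Long.parseLong(f[7]);
        Long prev = lastOps.put(dev, ops);
        if (prev != null) {
          delta.put(dev, ops - prev);
        }
      }
    } finally {
      in.close();
    }
    return delta;
  }
}
{code}

Dividing each delta by the sampling interval gives ops/second/drive; a drive whose op rate is far above its peers while moving little data is the bad-layout case described in the comment.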
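On the congestion side, the per-connection data the comment alludes to does exist in the kernel: on Linux the TCP_INFO socket option exposes the smoothed RTT and retransmit counts for a socket, but it is not reachable from pure Java. A hedged approximation is to watch the host-wide TCP counters in /proc/net/snmp and treat a rising retransmit ratio as a congestion signal; the TcpRetransSampler class below and the ratio it computes are illustrative only, not part of any attached patch.

{code:java}
import java.io.BufferedReader;
import java.io.FileReader;
import java.io.IOException;

/**
 * Sketch: read host-wide TCP counters from Linux /proc/net/snmp and
 * report the retransmit ratio since the previous sample as a cheap
 * congestion signal.
 */
public class TcpRetransSampler {
  private long lastRetrans = -1;
  private long lastOut = -1;

  /** Returns retransmitted/output segments since the last call, or -1 if unknown. */
  public double sampleRetransRatio() throws IOException {
    String header = null;
    String values = null;
    BufferedReader in = new BufferedReader(new FileReader("/proc/net/snmp"));
    try {
      String line;
      while ((line = in.readLine()) != null) {
        if (!line.startsWith("Tcp:")) {
          continue;
        }
        if (header == null) {
          header = line;                  // first Tcp: line names the fields
        } else {
          values = line;                  // second Tcp: line carries the counters
          break;
        }
      }
    } finally {
      in.close();
    }
    if (header == null || values == null) {
      return -1;
    }
    String[] names = header.split("\\s+");
    String[] vals = values.split("\\s+");
    long retrans = -1, out = -1;
    for (int i = 0; i < names.length && i < vals.length; i++) {
      if ("RetransSegs".equals(names[i])) retrans = Long.parseLong(vals[i]);
      if ("OutSegs".equals(names[i])) out = Long.parseLong(vals[i]);
    }
    double ratio = -1;
    if (lastRetrans >= 0 && out > lastOut && retrans >= lastRetrans) {
      ratio = (double) (retrans - lastRetrans) / (out - lastOut);
    }
    lastRetrans = retrans;
    lastOut = out;
    return ratio;
  }
}
{code}

A scheduler could back off the percentage of non-local tasks when this ratio crosses a threshold, along the lines of the locality adjustment suggested in the comment.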