[ 
https://issues.apache.org/jira/browse/MAPREDUCE-1304?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Chris Douglas updated MAPREDUCE-1304:
-------------------------------------

    Status: Open  (was: Patch Available)

Great idea. A few comments on the implementation:
* This has no effect:
{noformat}
+    job.getConfiguration().set("io.sort.record.pct", "0.50");
{noformat}
* Moving the update of the GC counter to {{Task::updateCounters}} makes sense. 
To capture any {{FileSystem}} activity in the committer, the call to 
{{updateCounters}} in {{Task::done}} should be after the commit, anyway. This 
is a bug in the current code.
* Adding {{java.lang.management}} components shouldn't widen the 
{{TaskReporter}} API or add fields to it. A dedicated {{\*Updater}} object 
should track each statistic (as with the {{FileSystem}} counters) rather than 
letting the {{TaskReporter}} keep this state.
* Updating the GC counters must be synchronized or it risks reporting incorrect 
results (moving the update to {{Task::updateCounters}} would be sufficient).
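
The last two points could be sketched roughly as below. This is a minimal illustration only, not code from the patch: {{GcTimeUpdater}} and {{getElapsedGcMillis}} are hypothetical names, and the wiring into {{Task::updateCounters}} is assumed rather than shown. Only the {{java.lang.management}} calls are real JDK API.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.List;

/**
 * Hypothetical updater that owns the GC-time state, so the reporter
 * does not need extra fields. Not a Hadoop class.
 */
class GcTimeUpdater {
  private final List<GarbageCollectorMXBean> gcBeans =
      ManagementFactory.getGarbageCollectorMXBeans();
  // Cumulative GC millis observed at the previous call.
  private long lastGcMillis = 0;

  /** Sums cumulative collection time over all collectors in this JVM. */
  private long totalGcMillis() {
    long total = 0;
    for (GarbageCollectorMXBean bean : gcBeans) {
      long millis = bean.getCollectionTime(); // -1 if undefined for this collector
      if (millis > 0) {
        total += millis;
      }
    }
    return total;
  }

  /**
   * Returns the GC millis accumulated since the previous call.
   * Synchronized so concurrent callers cannot both read the same
   * lastGcMillis and report the same interval twice.
   */
  synchronized long getElapsedGcMillis() {
    long now = totalGcMillis();
    long delta = Math.max(0, now - lastGcMillis);
    lastGcMillis = now;
    return delta;
  }
}
```

A caller would add the returned delta to the task's GC counter each time counters are refreshed. If that only ever happens inside an already-synchronized {{updateCounters}}, the {{synchronized}} on this method is redundant but harmless.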

> Add counters for task time spent in GC
> --------------------------------------
>
>                 Key: MAPREDUCE-1304
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-1304
>             Project: Hadoop Map/Reduce
>          Issue Type: New Feature
>          Components: task
>            Reporter: Todd Lipcon
>            Assignee: Aaron Kimball
>         Attachments: MAPREDUCE-1304.2.patch, MAPREDUCE-1304.patch
>
>
> It's easy to grab the number of millis spent in GC (see JvmMetrics for 
> example). Exposing these as task counters would be handy - occasionally I've 
> seen user jobs where long GC pauses cause big "unexplainable" performance 
> problems, and a large counter would make it obvious to the user what's going 
> on.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
