[ https://issues.apache.org/jira/browse/MAPREDUCE-4469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13429498#comment-13429498 ]

Todd Lipcon commented on MAPREDUCE-4469:
----------------------------------------

I agree that this code path only sets those counters. But they're useful 
counters, so while it's nice to be able to disable the calculation for a 
speed boost, I think we should also look into whether there is a more 
efficient way to provide them. Does anyone have an idea about this?

Lacking a more efficient way of determining the child process hierarchy, we 
could probably optimize the code to cut down the set of /proc entries that 
are actually opened and examined, e.g.:
- only look at directories in /proc which are owned by the current user
- only look at directories which were created more recently than the current 
pid's directory
- for each pid, cache its creation time, and if you've already looked at that 
pid, don't look at it again in the next iteration of the plugin.
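
A rough sketch of the three filters above, in Java, might look like the 
following. This is not the code from the patch: the class and method names 
are hypothetical, the /proc root is passed in as a parameter so the logic 
can be exercised against any directory, and it assumes the modification 
time of a pid's directory is a usable stand-in for its creation time (the 
starttime field in /proc/<pid>/stat may be a more reliable source).

```java
import java.io.IOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of Hadoop.
class ProcScanSketch {
    // pid -> directory timestamp recorded the last time we examined it
    private final Map<String, FileTime> seen = new HashMap<>();
    private final Path procRoot;
    private final String user;

    ProcScanSketch(Path procRoot, String user) {
        this.procRoot = procRoot;
        this.user = user;
    }

    /** Returns the pids that still need to be opened on this iteration. */
    List<String> candidates(String selfPid) throws IOException {
        FileTime selfTime = Files.getLastModifiedTime(procRoot.resolve(selfPid));
        List<String> result = new ArrayList<>();
        try (DirectoryStream<Path> dirs = Files.newDirectoryStream(procRoot)) {
            for (Path dir : dirs) {
                String pid = dir.getFileName().toString();
                if (!pid.matches("\\d+")) {
                    continue; // not a pid directory
                }
                try {
                    // Filter 1: only directories owned by the given user.
                    if (!Files.getOwner(dir).getName().equals(user)) {
                        continue;
                    }
                    // Filter 2: skip anything older than our own directory.
                    // Equal timestamps are kept, since a child may start in
                    // the same clock tick (this also keeps selfPid itself).
                    FileTime t = Files.getLastModifiedTime(dir);
                    if (t.compareTo(selfTime) < 0) {
                        continue;
                    }
                    // Filter 3: skip pids already examined with the same
                    // timestamp; a reused pid has a new timestamp.
                    if (t.equals(seen.put(pid, t))) {
                        continue;
                    }
                    result.add(pid);
                } catch (IOException e) {
                    // The process exited while we were scanning; ignore it.
                }
            }
        }
        return result;
    }
}
```

On a real system this would be constructed with Paths.get("/proc") and the 
name of the user running the task. Note that the cache is never pruned 
here, so a long-running caller would want to drop entries for pids that no 
longer exist.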

I imagine with the above optimizations we could get the overhead down to 1-2%.

As for the patch itself, the config naming isn't very clear. In my version I 
called the same config "mapred.task.calculate.resource.usage" -- to indicate 
that it's a boolean flag, not a class name.
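
To illustrate the naming point, here is a minimal sketch of how a boolean 
flag with that name could be read. It uses java.util.Properties as a 
stand-in for Hadoop's configuration object so the example has no 
dependencies; the class name, the helper method, and the default of true 
are all assumptions, not taken from either patch.

```java
import java.util.Properties;

// Hypothetical helper, not part of Hadoop.
class ResourceUsageFlagSketch {
    static final String KEY = "mapred.task.calculate.resource.usage";

    /** Returns whether resource usage should be calculated (assumed default: true). */
    static boolean shouldCalculate(Properties conf) {
        return Boolean.parseBoolean(conf.getProperty(KEY, "true"));
    }
}
```

A key ending in "usage" with a true/false value reads as a switch, whereas 
a key that suggests a plugin or class name invites users to set it to a 
class name.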
                
> Resource calculation in child tasks is CPU-heavy
> ------------------------------------------------
>
>                 Key: MAPREDUCE-4469
>                 URL: https://issues.apache.org/jira/browse/MAPREDUCE-4469
>             Project: Hadoop Map/Reduce
>          Issue Type: Bug
>          Components: performance, task
>    Affects Versions: 1.0.3
>            Reporter: Todd Lipcon
>            Assignee: Ahmed Radwan
>         Attachments: MAPREDUCE-4469.patch
>
>
> In doing some benchmarking on a hadoop-1 derived codebase, I noticed that 
> each of the child tasks was doing a ton of syscalls. Upon stracing, I noticed 
> that it's spending a lot of time looping through all the files in /proc to 
> calculate resource usage.
> As a test, I added a flag to disable use of the ResourceCalculatorPlugin 
> within the tasks. On a CPU-bound 500G-sort workload, this improved total job 
> runtime by about 10% (map slot-seconds by 14%, reduce slot-seconds by 8%).
