[
https://issues.apache.org/jira/browse/HADOOP-15549?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16518375#comment-16518375
]
Todd Lipcon commented on HADOOP-15549:
--------------------------------------
I ran a simple program that just calls DefaultMetricsSystem.initialize and
compared the perf stat output for the Hadoop 2.8.2 and 3.0.0 dist tarballs:
*2.8.2:*
{code}
   683.416696  task-clock (msec)  # 1.793 CPUs utilized    ( +- 2.32% )
        1,790  context-switches   # 0.003 M/sec            ( +- 1.07% )
           54  cpu-migrations     # 0.080 K/sec            ( +- 17.64% )
       13,688  page-faults        # 0.020 M/sec            ( +- 0.54% )
2,216,866,739  cycles             # 3.244 GHz              ( +- 1.62% )
2,299,332,469  instructions       # 1.04 insn per cycle    ( +- 1.21% )
  431,487,977  branches           # 631.369 M/sec          ( +- 1.17% )
   19,346,551  branch-misses      # 4.48% of all branches  ( +- 1.07% )
  0.381138028  seconds time elapsed                        ( +- 2.52% )
{code}
*3.0.0:*
{code}
   924.881803  task-clock (msec)  # 1.902 CPUs utilized    ( +- 2.05% )
        1,962  context-switches   # 0.002 M/sec            ( +- 0.73% )
           44  cpu-migrations     # 0.047 K/sec            ( +- 11.15% )
       20,593  page-faults        # 0.022 M/sec            ( +- 0.55% )
3,042,371,457  cycles             # 3.289 GHz              ( +- 1.67% )
3,165,586,053  instructions       # 1.04 insn per cycle    ( +- 1.41% )
  592,945,118  branches           # 641.104 M/sec          ( +- 1.36% )
   25,735,278  branch-misses      # 4.34% of all branches  ( +- 1.30% )
  0.486354791  seconds time elapsed                        ( +- 2.04% )
{code}
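Not all of the regression is due to the metrics system initialization, but with
a small patch that avoids the "builder" APIs, I can recover some of it:
roughly 40 ms of task-clock (924.9 ms down to 885.3 ms) and roughly 46 ms of
elapsed time (0.486 s down to 0.441 s) on 3.0.0.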
{code}
   885.276567  task-clock (msec)  # 2.009 CPUs utilized    ( +- 1.45% )
        1,608  context-switches   # 0.002 M/sec            ( +- 2.02% )
           48  cpu-migrations     # 0.055 K/sec            ( +- 12.98% )
       18,949  page-faults        # 0.021 M/sec            ( +- 0.88% )
2,908,533,684  cycles             # 3.285 GHz              ( +- 0.46% )
3,045,577,520  instructions       # 1.05 insn per cycle    ( +- 0.66% )
  566,661,963  branches           # 640.096 M/sec          ( +- 0.67% )
   24,309,912  branch-misses      # 4.29% of all branches  ( +- 0.77% )
  0.440731241  seconds time elapsed                        ( +- 2.98% )
{code}
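To put the three runs side by side, the arithmetic can be sketched as below. This is only an illustration: PerfDelta and its methods are hypothetical names, not part of Hadoop or perf, and the inputs are the task-clock and elapsed-time means copied from the output above. With run-to-run variation of 2-3% reported by perf, the derived figures should be read as approximate.

```java
// Hypothetical helper for comparing perf stat means; not part of Hadoop or perf.
class PerfDelta {
    /** Absolute change from the baseline measurement to the current one. */
    static double delta(double baseline, double current) {
        return current - baseline;
    }

    /** Fraction of the regression (regressed - baseline) that the patched run wins back. */
    static double recoveredFraction(double baseline, double regressed, double patched) {
        return (regressed - patched) / (regressed - baseline);
    }

    public static void main(String[] args) {
        // task-clock (msec) means for 2.8.2, 3.0.0, and 3.0.0 with the patch.
        double[] taskClockMs = {683.416696, 924.881803, 885.276567};
        // elapsed time (seconds) means for the same three runs.
        double[] elapsedSec = {0.381138028, 0.486354791, 0.440731241};

        System.out.printf("task-clock regression: %.1f ms%n",
                delta(taskClockMs[0], taskClockMs[1]));
        System.out.printf("task-clock recovered by patch: %.1f%% of the regression%n",
                100.0 * recoveredFraction(taskClockMs[0], taskClockMs[1], taskClockMs[2]));
        System.out.printf("elapsed regression: %.1f ms%n",
                1000.0 * delta(elapsedSec[0], elapsedSec[1]));
        System.out.printf("elapsed recovered by patch: %.1f%% of the regression%n",
                100.0 * recoveredFraction(elapsedSec[0], elapsedSec[1], elapsedSec[2]));
    }
}
```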
The patched build also loads fewer classes (1651 vs 1768) by eliminating usage
of commons-beanutils and a bunch of ancillary classes in commons-configuration.
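The comment does not say how the class counts were obtained. One way to get a comparable number from inside the JVM, assuming the standard java.lang.management API is acceptable, is sketched below. ClassLoadCounter is a hypothetical name and the workload in main is a placeholder; in this scenario the Runnable would wrap the call to DefaultMetricsSystem.initialize.

```java
import java.lang.management.ClassLoadingMXBean;
import java.lang.management.ManagementFactory;
import java.util.concurrent.ConcurrentSkipListMap;

// Hypothetical helper; not part of Hadoop.
class ClassLoadCounter {
    /** Returns how many classes the JVM loaded while the given action ran. */
    static long classesLoadedBy(Runnable action) {
        ClassLoadingMXBean bean = ManagementFactory.getClassLoadingMXBean();
        long before = bean.getTotalLoadedClassCount();
        action.run();
        return bean.getTotalLoadedClassCount() - before;
    }

    public static void main(String[] args) {
        // Placeholder workload: touch a class that is probably not loaded yet.
        long loaded = classesLoadedBy(() -> new ConcurrentSkipListMap<String, String>());
        System.out.println("classes loaded by workload: " + loaded);
        System.out.println("total classes loaded so far: "
                + ManagementFactory.getClassLoadingMXBean().getTotalLoadedClassCount());
    }
}
```

The count is JVM-wide, so classes loaded by other threads during the action are included. Running the JVM with class-loading logging enabled is an alternative when the names of the extra classes matter, not just how many there are.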
> Upgrade to commons-configuration 2.1 regresses task CPU consumption
> -------------------------------------------------------------------
>
> Key: HADOOP-15549
> URL: https://issues.apache.org/jira/browse/HADOOP-15549
> Project: Hadoop Common
> Issue Type: Bug
> Components: metrics
> Affects Versions: 3.0.2
> Reporter: Todd Lipcon
> Assignee: Todd Lipcon
> Priority: Major
>
> HADOOP-13660 upgraded from commons-configuration 1.x to 2.x.
> commons-configuration is used when parsing the metrics configuration
> properties file. The new builder API used in the new version apparently makes
> use of a bunch of very bloated reflection and classloading nonsense to
> achieve the same goal, and this results in a regression of >100ms of CPU time
> as measured by a program which simply initializes DefaultMetricsSystem.
> This isn't a big deal for long-running daemons, but for MR tasks which might
> only run a few seconds on poorly-tuned jobs, this can be noticeable.