[
https://issues.apache.org/jira/browse/LOG4J2-1120?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14742443#comment-14742443
]
Remko Popma edited comment on LOG4J2-1120 at 9/13/15 12:02 PM:
---------------------------------------------------------------
Benchmark results:
iterMapXElement iterates over a ConcurrentHashMap<String,Long>.values()
collection containing X elements, iterArraySet... iterates over a
CopyOnWriteArraySet<Long>. There are benchmarks for a single element and three
elements, since I assume that in most cases there will be one or at most a few
appenders.
*1 thread*
{code}
Benchmark Mode Samples
Score Error Units
o.a.l.l.p.j.LoggerConfigBenchmark.iterArraySet1Element sample 115366
34.429 ± 1.386 ns/op
o.a.l.l.p.j.LoggerConfigBenchmark.iterArraySet3Element sample 193159
31.368 ± 1.041 ns/op
o.a.l.l.p.j.LoggerConfigBenchmark.iterMap1Element sample 108869
79.615 ± 2.133 ns/op
o.a.l.l.p.j.LoggerConfigBenchmark.iterMap3Elements sample 184664
67.984 ± 1.660 ns/op
o.a.l.l.p.j.LoggerConfigBenchmark.logWithCountersAndLock sample 103840
139.756 ± 2.643 ns/op
o.a.l.l.p.j.LoggerConfigBenchmark.logWithCountersNoLocks sample 143068
102.707 ± 2.006 ns/op
o.a.l.l.p.j.LoggerConfigBenchmark.logWithoutCountersOrLocks sample 172408
84.748 ± 1.723 ns/op
o.a.l.l.p.j.LoggerConfigBenchmark.testBaseline sample 101878
19.656 ± 1.136 ns/op
{code}
*4 threads*
{code}
Benchmark Mode Samples
Score Error Units
o.a.l.l.p.j.LoggerConfigBenchmark.iterArraySet1Element sample 487222
74.218 ± 0.938 ns/op
o.a.l.l.p.j.LoggerConfigBenchmark.iterArraySet3Element sample 391496
52.295 ± 0.915 ns/op
o.a.l.l.p.j.LoggerConfigBenchmark.iterMap1Element sample 456270
130.438 ± 1.790 ns/op
o.a.l.l.p.j.LoggerConfigBenchmark.iterMap3Elements sample 393511
137.019 ± 4.921 ns/op
o.a.l.l.p.j.LoggerConfigBenchmark.logWithCountersAndLock sample 466650
387.175 ± 6.982 ns/op
o.a.l.l.p.j.LoggerConfigBenchmark.logWithCountersNoLocks sample 583383
314.528 ± 2.917 ns/op
o.a.l.l.p.j.LoggerConfigBenchmark.logWithoutCountersOrLocks sample 425858
236.417 ± 26.062 ns/op
o.a.l.l.p.j.LoggerConfigBenchmark.testBaseline sample 708875
55.932 ± 0.732 ns/op
{code}
Iterating over a CopyOnWriteArraySet is quite a bit faster than iterating over
ConcurrentHashMap.values(), both in single-threaded and multi-threaded
scenarios.
Avoiding lock acquisition is significantly faster, even if the lock is not
contended.
Finally, not updating the atomic counter can make a big difference and is worth
doing if it is safe to do so.
was (Author: [email protected]):
Benchmark results: (iterMapXElement iterates over a ConcurrentHashMap
containing X elements, iterArraySet... iterates over a CopyOnWriteArraySet)
*1 thread*
{code}
Benchmark Mode Samples
Score Error Units
o.a.l.l.p.j.LoggerConfigBenchmark.iterArraySet1Element sample 115366
34.429 ± 1.386 ns/op
o.a.l.l.p.j.LoggerConfigBenchmark.iterArraySet3Element sample 193159
31.368 ± 1.041 ns/op
o.a.l.l.p.j.LoggerConfigBenchmark.iterMap1Element sample 108869
79.615 ± 2.133 ns/op
o.a.l.l.p.j.LoggerConfigBenchmark.iterMap3Elements sample 184664
67.984 ± 1.660 ns/op
o.a.l.l.p.j.LoggerConfigBenchmark.logWithCountersAndLock sample 103840
139.756 ± 2.643 ns/op
o.a.l.l.p.j.LoggerConfigBenchmark.logWithCountersNoLocks sample 143068
102.707 ± 2.006 ns/op
o.a.l.l.p.j.LoggerConfigBenchmark.logWithoutCountersOrLocks sample 172408
84.748 ± 1.723 ns/op
o.a.l.l.p.j.LoggerConfigBenchmark.testBaseline sample 101878
19.656 ± 1.136 ns/op
{code}
*4 threads*
{code}
Benchmark Mode Samples
Score Error Units
o.a.l.l.p.j.LoggerConfigBenchmark.iterArraySet1Element sample 487222
74.218 ± 0.938 ns/op
o.a.l.l.p.j.LoggerConfigBenchmark.iterArraySet3Element sample 391496
52.295 ± 0.915 ns/op
o.a.l.l.p.j.LoggerConfigBenchmark.iterMap1Element sample 456270
130.438 ± 1.790 ns/op
o.a.l.l.p.j.LoggerConfigBenchmark.iterMap3Elements sample 393511
137.019 ± 4.921 ns/op
o.a.l.l.p.j.LoggerConfigBenchmark.logWithCountersAndLock sample 466650
387.175 ± 6.982 ns/op
o.a.l.l.p.j.LoggerConfigBenchmark.logWithCountersNoLocks sample 583383
314.528 ± 2.917 ns/op
o.a.l.l.p.j.LoggerConfigBenchmark.logWithoutCountersOrLocks sample 425858
236.417 ± 26.062 ns/op
o.a.l.l.p.j.LoggerConfigBenchmark.testBaseline sample 708875
55.932 ± 0.732 ns/op
{code}
Iterating over a CopyOnWriteArraySet is roughly twice as fast as over
ConcurrentHashMap.values(), both in single-threaded and multi-threaded
scenarios.
Avoiding lock acquisition is faster, but not reading/updating the counters can
make it even faster.
> LoggerConfig performance improvements
> -------------------------------------
>
> Key: LOG4J2-1120
> URL: https://issues.apache.org/jira/browse/LOG4J2-1120
> Project: Log4j 2
> Issue Type: Improvement
> Components: Core
> Affects Versions: 2.3
> Reporter: Remko Popma
> Assignee: Remko Popma
> Fix For: 2.4
>
>
> As discussed on the mailing list, I can see several ways to improve the
> performance of the hot path in LoggerConfig.
> # Pulling up the {{shutdown.get()}} check into {{afterLogEvent()}} will avoid
> acquiring the shutdownLock unnecessarily - only acquiring the lock if a
> shutdown is actually in progress.
> # Another performance improvement can be made by changing the data structure
> that holds the AppenderControl objects from ConcurrentHashMap to
> CopyOnWriteArraySet. Modifications on this data structure are extremely rare,
> and the array-based collection has much better iteration performance.
> # Finally, since {{clearAppenders()}} is only called after all appenders have
> been stopped, {{waitForCompletion()}} may no longer be necessary (unless I am
> missing something here). If so, the {{shutdownLock}}, {{shutdown}} and
> {{counter}} fields can be removed. Not incrementing the atomic counters with
> every event in the hot path should give better performance.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]