[
https://issues.apache.org/jira/browse/LOG4J2-1430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15388027#comment-15388027
]
Remko Popma commented on LOG4J2-1430:
-------------------------------------
John, Anthony, please take a look at the Log4j 2 [performance
page|https://logging.apache.org/log4j/2.x/performance.html] for an idea of what
I would like to see.
* Comparison of various queue options across a wide range of thread counts. I
like using 1, 2, 4, 8, 16, 32 and 64 threads. It does not need to be exactly
these numbers, but this gives a good idea of behaviour across a wide range of
thread counts without having to run too many perf tests. It is good to include
perf results for runs that use more threads than physical cores.
* Perf tests should be run on "vanilla" configurations. We could do a comparison to
show how throughput/latency behaviour changes if you use spinlocks, thread
affinity or taskset but the main results should be without these. (I'm okay
with hyperthreading and power saving mode enabled or disabled as long as we
document what we did.) I realize this introduces noise that is not relevant to
the choice of queue, but more important to me is to avoid giving the impression
that users need to do things like thread affinity or taskset to achieve good
performance with Log4j 2.
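To make the thread sweep above concrete, here is a minimal sketch that prints one JMH command line per thread count. The jar path and the benchmark regex are placeholders (adjust them to your build), and the class and method names are made up for illustration; the only JMH option used is {{-t}}, which sets the number of worker threads.

```java
import java.util.ArrayList;
import java.util.List;

class ThreadSweep {
    // Powers of two from 1 up to max, e.g. 1, 2, 4, ..., 64.
    static List<Integer> threadCounts(int max) {
        List<Integer> counts = new ArrayList<>();
        for (int t = 1; t <= max; t *= 2) {
            counts.add(t);
        }
        return counts;
    }

    // One JMH command line per thread count.
    // jar and regex are placeholders, not the project's actual paths.
    static List<String> commands(String jar, String regex, int maxThreads) {
        List<String> cmds = new ArrayList<>();
        for (int t : threadCounts(maxThreads)) {
            cmds.add("java -jar " + jar + " \"" + regex + "\" -t " + t);
        }
        return cmds;
    }

    public static void main(String[] args) {
        // Hypothetical jar location; the regex matches the benchmark class
        // named in the issue description.
        for (String cmd : commands("benchmarks.jar",
                ".*AsyncAppenderLog4j2Benchmark.*", 64)) {
            System.out.println(cmd);
        }
    }
}
```

Running the same sweep once per queue implementation, on an otherwise untuned machine, would give the kind of comparison described in the two points above.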
Log4j is used in a wide variety of environments by a wide variety of
applications. There is no single "typical use case", which makes it hard for us
to tune it for optimal performance. The best we can do is show performance
behaviour under various configurations. Users will pick the result that gives
the best trade-off for their application and environment.
About testing with RandomAccessFileAppender: JMH does not allow us to control
the number of invocations, and it will always fill up the queue, after which
performance drops to the throughput of the RandomAccessFileAppender. It may be
interesting to show the difference of the various queues in this scenario, but
my assumption so far has been that this scenario is a rare one. Async logging
is only useful to absorb _bursts of events_. If the application's _sustained_
logging rate is faster than the underlying appender can keep up with, the
application is better off using plain _synchronous_ logging because any async
logger would just introduce jitter. For this reason (and after a long
discussion on this topic on the Mechanical Sympathy mailing list) the
comparison of async logging mechanisms tested with JMH is done with a
NoOpAppender. If nobody objects I suggest we stick with that.
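A back-of-the-envelope sketch of the burst-versus-sustained argument above: a bounded queue of a given capacity fills after capacity / (producer rate - consumer rate) seconds, and from then on the producer is limited to the consumer's rate. The capacity and rates below are made-up illustrative figures, not measurements, and the class name is hypothetical.

```java
class BurstAbsorption {
    // Seconds until a bounded queue is full, or +Infinity if the
    // consumer (the underlying appender) keeps up with the producer.
    static double secondsUntilFull(long capacity,
                                   double producedPerSec,
                                   double consumedPerSec) {
        double net = producedPerSec - consumedPerSec;
        return net <= 0 ? Double.POSITIVE_INFINITY : capacity / net;
    }

    public static void main(String[] args) {
        long capacity = 256 * 1024;      // hypothetical queue size in slots
        double producer = 5_000_000;     // hypothetical logging rate, events/s
        double appender = 1_000_000;     // hypothetical appender rate, events/s

        // Sustained overload: the queue absorbs the excess only briefly.
        System.out.println("full after (s): "
                + secondsUntilFull(capacity, producer, appender));

        // Appender keeps up: the queue never fills.
        System.out.println("full after (s): "
                + secondsUntilFull(capacity, 500_000, appender));
    }
}
```

With these assumed numbers the queue is full in well under a tenth of a second, so a benchmark that logs as fast as it can against a real appender mostly measures the appender rather than the queue.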
> Add optional support for Conversant DisruptorBlockingQueue in AsyncAppender
> ---------------------------------------------------------------------------
>
> Key: LOG4J2-1430
> URL: https://issues.apache.org/jira/browse/LOG4J2-1430
> Project: Log4j 2
> Issue Type: New Feature
> Components: Appenders
> Affects Versions: 2.6.1
> Reporter: Matt Sicker
> Assignee: Matt Sicker
> Fix For: 2.7
>
> Attachments: AsyncAppenderPerf01.txt, AsyncLogBenchmarks.log,
> conversantvsjctoolsnumthreads.jpg, jctools-vs-conversant-service-time.png,
> log4j2-1430-jctools-tmp-patch.txt, log4jHaswell2cpu2core.jpg,
> log4jHaswell2cpu4core.jpg, log4jrafile.log, log4jthread2cpu2core.log,
> log4jthread2cpu4core.log
>
>
> [Conversant Disruptor|https://github.com/conversant/disruptor] works as an
> implementation of BlockingQueue that is much faster than ArrayBlockingQueue.
> I did some benchmarks earlier and found it to be a bit faster:
> h3. AsyncAppender/ArrayBlockingQueue
> {code}
> Benchmark                                                     Mode  Samples         Score        Error  Units
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput10Params  thrpt       20   1101267.173 ±  17583.204  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput11Params  thrpt       20   1128269.255 ±  12188.910  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput1Param    thrpt       20   1525470.805 ±  56515.933  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput2Params   thrpt       20   1789434.196 ±  42733.475  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput3Params   thrpt       20   1803276.278 ±  34938.176  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput4Params   thrpt       20   1468550.776 ±  26402.286  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput5Params   thrpt       20   1322304.349 ±  22417.997  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput6Params   thrpt       20   1179756.489 ±  16502.276  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput7Params   thrpt       20   1324660.677 ±  18893.944  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput8Params   thrpt       20   1309365.962 ±  19602.489  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput9Params   thrpt       20   1422144.180 ±  20815.042  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughputSimple    thrpt       20   1247862.372 ±  18300.764  ops/s
> {code}
> h3. AsyncAppender/DisruptorBlockingQueue
> {code}
> Benchmark                                                     Mode  Samples         Score        Error  Units
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput10Params  thrpt       20   3704735.586 ±  59766.253  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput11Params  thrpt       20   3622175.410 ±  31975.353  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput1Param    thrpt       20   6862480.428 ± 121473.276  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput2Params   thrpt       20   6193288.988 ±  93545.144  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput3Params   thrpt       20   5715621.712 ± 131878.581  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput4Params   thrpt       20   5745187.005 ± 213854.016  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput5Params   thrpt       20   5307137.396 ±  88135.709  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput6Params   thrpt       20   4953015.419 ±  72100.403  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput7Params   thrpt       20   4833836.418 ±  52919.314  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput8Params   thrpt       20   4353791.507 ±  79047.812  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput9Params   thrpt       20   4136761.624 ±  67804.253  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughputSimple    thrpt       20   6719456.722 ± 187433.301  ops/s
> {code}
> h3. AsyncLogger
> {code}
> Benchmark                                              Mode  Samples         Score        Error  Units
> o.a.l.l.p.j.AsyncLoggersBenchmark.throughput10Params  thrpt       20   5075883.371 ± 180465.316  ops/s
> o.a.l.l.p.j.AsyncLoggersBenchmark.throughput11Params  thrpt       20   4867362.030 ± 193909.465  ops/s
> o.a.l.l.p.j.AsyncLoggersBenchmark.throughput1Param    thrpt       20  10294733.024 ± 226536.965  ops/s
> o.a.l.l.p.j.AsyncLoggersBenchmark.throughput2Params   thrpt       20   9021650.667 ± 351102.255  ops/s
> o.a.l.l.p.j.AsyncLoggersBenchmark.throughput3Params   thrpt       20   8079337.905 ± 115824.975  ops/s
> o.a.l.l.p.j.AsyncLoggersBenchmark.throughput4Params   thrpt       20   7347356.788 ±  66598.738  ops/s
> o.a.l.l.p.j.AsyncLoggersBenchmark.throughput5Params   thrpt       20   6930636.174 ± 150072.908  ops/s
> o.a.l.l.p.j.AsyncLoggersBenchmark.throughput6Params   thrpt       20   6309567.300 ± 293709.787  ops/s
> o.a.l.l.p.j.AsyncLoggersBenchmark.throughput7Params   thrpt       20   6051997.196 ± 268405.087  ops/s
> o.a.l.l.p.j.AsyncLoggersBenchmark.throughput8Params   thrpt       20   5273376.623 ±  99168.461  ops/s
> o.a.l.l.p.j.AsyncLoggersBenchmark.throughput9Params   thrpt       20   5091137.594 ± 150617.444  ops/s
> o.a.l.l.p.j.AsyncLoggersBenchmark.throughputSimple    thrpt       20  11136623.731 ± 400350.272  ops/s
> {code}