[
https://issues.apache.org/jira/browse/LOG4J2-928?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16075093#comment-16075093
]
ASF GitHub Bot commented on LOG4J2-928:
---------------------------------------
Github user garydgregory commented on the issue:
https://github.com/apache/logging-log4j2/pull/87
Thank you for running the benchmarks. Is this worth doing over on the _current_ Java 8 runtime (_131)?
Gary

On Wed, Jul 5, 2017 at 6:59 AM, Remko Popma <[email protected]> wrote:
> As promised, here are the throughput results on enterprise hardware.
>
> Hardware: total 64 cores (4x 16-core) Intel(R) Xeon(R) CPU E5-2697A v4 @ 2.60GHz
> Java HotSpot 64-bit server VM 1.8.0_112
>
> Results look good: as soon as we introduce concurrency, the lock-free MMap appender starts outperforming the other file appenders.
>
> *1 thread*
>
> {code}
> -sh-4.2$ java -jar benchmarks.jar ".*Log4j2AppenderThroughput.*" -wi 5 -f 1 -i 10
>
> Benchmark                                               (loggerType)  (messageSizeBytes)   Mode  Samples        Score        Error  Units
> o.a.l.l.p.j.Log4j2AppenderThroughputBenchmark.appender          File                 150  thrpt       10  1344494.961 ± 221110.545  ops/s
> o.a.l.l.p.j.Log4j2AppenderThroughputBenchmark.appender           RAF                 150  thrpt       10  1563396.748 ±  12497.136  ops/s
> o.a.l.l.p.j.Log4j2AppenderThroughputBenchmark.appender          MMap                 150  thrpt       10  1223215.440 ±  14661.487  ops/s
> o.a.l.l.p.j.Log4j2AppenderThroughputBenchmark.end2end           File                 150  thrpt       10  1008537.505 ±   9213.181  ops/s
> o.a.l.l.p.j.Log4j2AppenderThroughputBenchmark.end2end            RAF                 150  thrpt       10   998564.075 ±  12207.661  ops/s
> o.a.l.l.p.j.Log4j2AppenderThroughputBenchmark.end2end           MMap                 150  thrpt       10   785114.881 ±   8827.768  ops/s
> {code}
>
> *4 threads*
>
> {code}
> -sh-4.2$ java -jar benchmarks.jar ".*Log4j2AppenderThroughput.*" -wi 5 -f 1 -i 10 -t 4
>
> Benchmark                                               (loggerType)  (messageSizeBytes)   Mode  Samples        Score        Error  Units
> o.a.l.l.p.j.Log4j2AppenderThroughputBenchmark.appender          File                 150  thrpt       10  1678485.793 ± 260632.925  ops/s
> o.a.l.l.p.j.Log4j2AppenderThroughputBenchmark.appender           RAF                 150  thrpt       10  1412574.559 ± 447410.040  ops/s
> o.a.l.l.p.j.Log4j2AppenderThroughputBenchmark.appender          MMap                 150  thrpt       10  1386414.915 ± 333117.992  ops/s
> o.a.l.l.p.j.Log4j2AppenderThroughputBenchmark.end2end           File                 150  thrpt       10   959662.994 ± 333541.548  ops/s
> o.a.l.l.p.j.Log4j2AppenderThroughputBenchmark.end2end            RAF                 150  thrpt       10   929405.390 ±  66673.436  ops/s
> o.a.l.l.p.j.Log4j2AppenderThroughputBenchmark.end2end           MMap                 150  thrpt       10  1013681.438 ± 155916.949  ops/s
> {code}
>
> *8 threads*
>
> {code}
> -sh-4.2$ java -jar benchmarks.jar ".*Log4j2AppenderThroughput.*" -wi 5 -f 1 -i 10 -t 8
>
> Benchmark                                               (loggerType)  (messageSizeBytes)   Mode  Samples        Score        Error  Units
> o.a.l.l.p.j.Log4j2AppenderThroughputBenchmark.appender          File                 150  thrpt       10  1164538.492 ±  62482.965  ops/s
> o.a.l.l.p.j.Log4j2AppenderThroughputBenchmark.appender           RAF                 150  thrpt       10  1194865.200 ±  72251.118  ops/s
> o.a.l.l.p.j.Log4j2AppenderThroughputBenchmark.appender          MMap                 150  thrpt       10  1338960.644 ±  71104.991  ops/s
> o.a.l.l.p.j.Log4j2AppenderThroughputBenchmark.end2end           File                 150  thrpt       10   944670.900 ±  69190.255  ops/s
> o.a.l.l.p.j.Log4j2AppenderThroughputBenchmark.end2end            RAF                 150  thrpt       10   930248.735 ±  28523.086  ops/s
> o.a.l.l.p.j.Log4j2AppenderThroughputBenchmark.end2end           MMap                 150  thrpt       10   998676.286 ±  36327.805  ops/s
> {code}
>
> *16 threads*
>
> {code}
> -sh-4.2$ java -jar benchmarks.jar ".*Log4j2AppenderThroughput.*" -wi 5 -f 1 -i 10 -t 16
>
> Benchmark                                               (loggerType)  (messageSizeBytes)   Mode  Samples        Score        Error  Units
> o.a.l.l.p.j.Log4j2AppenderThroughputBenchmark.appender          File                 150  thrpt       10  1214786.965 ±   7217.940  ops/s
> o.a.l.l.p.j.Log4j2AppenderThroughputBenchmark.appender           RAF                 150  thrpt       10  1192036.036 ±  13696.830  ops/s
> o.a.l.l.p.j.Log4j2AppenderThroughputBenchmark.appender          MMap                 150  thrpt       10  1548961.297 ±  27141.206  ops/s
> o.a.l.l.p.j.Log4j2AppenderThroughputBenchmark.end2end           File                 150  thrpt       10  1003235.059 ±  10365.921  ops/s
> o.a.l.l.p.j.Log4j2AppenderThroughputBenchmark.end2end            RAF                 150  thrpt       10   994834.287 ±   9057.470  ops/s
> o.a.l.l.p.j.Log4j2AppenderThroughputBenchmark.end2end           MMap                 150  thrpt       10  1323043.186 ±  19960.127  ops/s
> {code}
>
> *32 threads*
>
> {code}
> -sh-4.2$ java -jar benchmarks.jar ".*Log4j2AppenderThroughput.*" -wi 5 -f 1 -i 10 -t 32
>
> Benchmark                                               (loggerType)  (messageSizeBytes)   Mode  Samples        Score        Error  Units
> o.a.l.l.p.j.Log4j2AppenderThroughputBenchmark.appender          File                 150  thrpt       10  1372729.015 ±  40762.524  ops/s
> o.a.l.l.p.j.Log4j2AppenderThroughputBenchmark.appender           RAF                 150  thrpt       10  1185351.707 ±  16992.542  ops/s
> o.a.l.l.p.j.Log4j2AppenderThroughputBenchmark.appender          MMap                 150  thrpt       10  1784377.615 ± 179359.793  ops/s
> o.a.l.l.p.j.Log4j2AppenderThroughputBenchmark.end2end           File                 150  thrpt       10   918029.971 ±  38434.750  ops/s
> o.a.l.l.p.j.Log4j2AppenderThroughputBenchmark.end2end            RAF                 150  thrpt       10  1122310.566 ± 152840.084  ops/s
> o.a.l.l.p.j.Log4j2AppenderThroughputBenchmark.end2end           MMap                 150  thrpt       10  1898762.678 ±  81312.274  ops/s
> {code}
>
> *64 threads*
>
> {code}
> -sh-4.2$ java -jar benchmarks.jar ".*Log4j2AppenderThroughput.*" -wi 5 -f 1 -i 10 -t 64
>
> Benchmark                                               (loggerType)  (messageSizeBytes)   Mode  Samples        Score        Error  Units
> o.a.l.l.p.j.Log4j2AppenderThroughputBenchmark.appender          File                 150  thrpt       10  1193371.846 ±  32893.485  ops/s
> o.a.l.l.p.j.Log4j2AppenderThroughputBenchmark.appender           RAF                 150  thrpt       10  1231136.529 ±  26895.411  ops/s
> o.a.l.l.p.j.Log4j2AppenderThroughputBenchmark.appender          MMap                 150  thrpt       10  1409277.639 ±  91900.204  ops/s
> o.a.l.l.p.j.Log4j2AppenderThroughputBenchmark.end2end           File                 150  thrpt       10   881261.726 ±  35023.552  ops/s
> o.a.l.l.p.j.Log4j2AppenderThroughputBenchmark.end2end            RAF                 150  thrpt       10   929691.738 ±  22616.873  ops/s
> o.a.l.l.p.j.Log4j2AppenderThroughputBenchmark.end2end           MMap                 150  thrpt       10  1216108.561 ±  41321.530  ops/s
> {code}
>
> Lock-free synchronous sub-microsecond appender
> ----------------------------------------------
>
> Key: LOG4J2-928
> URL: https://issues.apache.org/jira/browse/LOG4J2-928
> Project: Log4j 2
> Issue Type: New Feature
> Components: Appenders
> Reporter: Remko Popma
>
> _(This is a work in progress.)_
> *Goal*
> It should be possible to create synchronous file appenders with (nearly) the
> same performance as async loggers.
> *Background*
> The key to async loggers' performance is the lock-free queue provided by
> the LMAX Disruptor. Multiple threads can add events to this queue without
> contending on a lock. This means throughput scales linearly with the number
> of threads: more threads = more throughput.
> With a lock-based design, on the other hand, performance maxes out with a
> single thread logging. Adding more threads does not help. In fact, total
> logging throughput goes down slightly when multiple threads are logging (see
> the Async Appenders in the [async performance
> comparison|http://logging.apache.org/log4j/2.x/manual/async.html#Asynchronous_Throughput_Comparison_with_Other_Logging_Packages]).
> Lock contention means that multiple threads together end up logging slower
> than a single thread.
> *Currently only async loggers are lock-free*
> Log4j2 provides good performance with async loggers, but this approach has
> several drawbacks:
> * dependency on the external LMAX Disruptor library
> * possibility of data loss: log events that have been put on the queue but
> not flushed to disk yet may be lost in the event of an application crash
> This ticket proposes a new feature to address these issues.
> *Proposal: a lock-free synchronous appender*
> For a single-threaded application the current MemoryMappedFileAppender has
> performance comparable to Async Loggers (TODO: perf test).
> However, the current implementation uses locks to control concurrency, and
> suffers from lock contention in multi-threaded scenarios.
> For inspiration for a lock-free solution, we can look at
> [Aeron|https://github.com/real-logic/Aeron], specifically Aeron's design for
> Log Buffers. Martin Thompson's September 2014 Strangeloop
> [presentation|https://www.youtube.com/watch?v=tM4YskS94b0] gives details on
> the design (the section from 16:45 to 23:30 is especially relevant).
> The way this works is that instead of using locks, concurrency is handled
> with a protocol where threads "reserve" blocks of memory atomically. Each
> thread (having serialized the log event) knows how many bytes it wants to
> write. It then atomically moves the buffer tail pointer forward by that many
> bytes using a CAS operation. Once the tail has moved, the thread is free to
> write the message payload bytes into the region of the buffer it just
> reserved, without needing to worry about other threads. Between threads, the
> only point of contention is the tail pointer, similar to the Disruptor. We
> can reasonably expect performance to scale linearly with the number of
> threads, like async loggers.
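>
> To make the protocol concrete, here is a minimal sketch, assuming an
> {{AtomicLong}} tail pointer over a direct {{ByteBuffer}}; the class name
> {{LogBuffer}} and its API are hypothetical, not the actual appender code:
>
> {code}
> import java.nio.ByteBuffer;
> import java.util.concurrent.atomic.AtomicLong;
>
> /** Sketch of the Aeron-style "reserve, then write" protocol. */
> public class LogBuffer {
>     private final ByteBuffer buffer;                  // in practice a MappedByteBuffer
>     private final AtomicLong tail = new AtomicLong(); // the only point of contention
>
>     public LogBuffer(int capacity) {
>         this.buffer = ByteBuffer.allocateDirect(capacity);
>     }
>
>     /**
>      * Atomically reserves {@code event.length} bytes, then copies the
>      * serialized event into the reserved region. Returns false when the
>      * buffer is full (a real appender would swap in a fresh buffer here).
>      */
>     public boolean write(byte[] event) {
>         long offset;
>         do {
>             offset = tail.get();
>             if (offset + event.length > buffer.capacity()) {
>                 return false; // full: trigger flush / buffer swap
>             }
>         } while (!tail.compareAndSet(offset, offset + event.length)); // CAS moves the tail
>
>         // The region [offset, offset + event.length) now belongs exclusively
>         // to this thread, so the copy itself needs no lock. The per-write
>         // duplicate() is a workaround for ByteBuffer's lack of absolute bulk
>         // puts (it gives this thread an independent cursor); Unsafe would
>         // replace this step (see snag 1 below).
>         ByteBuffer view = buffer.duplicate();
>         view.position((int) offset);
>         view.put(event);
>         return true;
>     }
> }
> {code}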
> *Still needs work*
> This looks promising, but there are a few snags.
> # Needs {{sun.misc.Unsafe}}. {{java.nio.ByteBuffer}} only provides relative,
> not absolute, bulk put operations. That is, it only allows appending byte
> arrays at the current cursor position, not writing them at a user-specified
> absolute location. The above design requires random access to be thread-safe.
> Aeron works around this by using {{sun.misc.Unsafe}} (a sketch follows this
> list). Users should be aware of this so they can decide whether the
> performance gain is worth the risk. Also, this may make the OSGi folks
> unhappy (see the LOG4J2-238 discussion)... Not sure how serious we are about
> making Log4j2 work on OSGi, but perhaps it is possible to mark the package
> for this feature as optional in the OSGi manifest. An alternative may be to
> put this appender in a separate module.
> # TBD: How many files/buffers to use? In his presentation Martin mentions
> that using a single large memory-mapped file will cause a lot of page faults,
> page cache churn, and unspecified VM issues. He recommends cycling between
> three smaller buffers: one active (currently written to), one dirty (full,
> now being processed by a background thread), and one clean (to swap in when
> the active buffer becomes full). I am not sure the page fault problem will
> occur for our use case: a Log4j appender is append-only, and there is no
> separate thread or process reading this data at the same time. If it does,
> and we decide on a similar design with three smaller buffers, we still need
> to work out whether these can be three different mapped regions in the same
> log file, or whether it is better to use a separate temporary file and copy
> from the temporary file to the target log file in a background thread. I
> would prefer to have a single file. Note that even with a single file we may
> want a background thread for mapping a new region at every swap and
> occasionally extending the file when EOF is reached.
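>
> To make snag 1 concrete, this is roughly what an Unsafe-based absolute bulk
> put looks like (Java 8 era internal APIs; the class and method names here are
> made up for illustration, and the reflective access is exactly the risk
> discussed above):
>
> {code}
> import java.lang.reflect.Field;
> import java.nio.Buffer;
> import java.nio.ByteBuffer;
> import sun.misc.Unsafe;
>
> public final class UnsafeAbsolutePut {
>     private static final Unsafe UNSAFE = loadUnsafe();
>
>     private static Unsafe loadUnsafe() {
>         try {
>             Field f = Unsafe.class.getDeclaredField("theUnsafe");
>             f.setAccessible(true);
>             return (Unsafe) f.get(null);
>         } catch (ReflectiveOperationException e) {
>             throw new ExceptionInInitializerError(e);
>         }
>     }
>
>     /** Native base address of a direct (or memory-mapped) buffer. */
>     static long addressOf(ByteBuffer directBuffer) throws ReflectiveOperationException {
>         Field address = Buffer.class.getDeclaredField("address");
>         address.setAccessible(true);
>         return address.getLong(directBuffer);
>     }
>
>     /** Bulk put at an absolute offset, independent of the buffer's cursor. */
>     static void putBytes(long baseAddress, long offset, byte[] src) {
>         UNSAFE.copyMemory(src, Unsafe.ARRAY_BYTE_BASE_OFFSET,
>                           null, baseAddress + offset, src.length);
>     }
> }
> {code}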
> Feedback welcome. I intend to update this ticket as I learn more.