[ https://issues.apache.org/jira/browse/LOG4J2-1430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15387588#comment-15387588 ]
Anthony Maire commented on LOG4J2-1430:
---------------------------------------

Hello John,

Thank you for all these benchmarks, but I'm afraid they are not really relevant.

Can you explain why you constrained the JVM to run on only 2 or 4 cores, with QPI transfers involved? I cannot see the point. As soon as the number of threads is high enough, we expect the queue to be mostly full, so for the non-JDK queues we will have a lot of threads spinning on very few cores, which is really bad.

Can you either:
- use a park-based strategy for the queues (i.e. WAITING for Conversant and the default PARK for JCTools), if this kind of taskset setup makes sense for your main use cases: I cannot imagine an application with 10 times more active threads than available cores that would want to use a spin- or yield-based strategy
- not constrain the JVM, so that it can use the 24 cores available on your box
- constrain the JVM to a whole NUMA node (using numactl -N 1 -m 1 java -jar ....) but not run more than 11 spinning producer threads.

Moreover, in such a heavily contended scenario, there is usually a throughput vs. latency tradeoff (if some threads are frozen, contention is released and overall throughput increases). I'm pretty sure that running the same benchmark in JMH "sample time" mode will show that the JCTools queue latency is lower and more stable than the Conversant one.
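To make the spin-versus-park distinction concrete, here is a minimal sketch using only JDK classes. It is not the Conversant or JCTools implementation: the class name WaitStrategySketch, the two helper methods, and the park interval are hypothetical and chosen purely for illustration. Both helpers retry offer() on a full queue; the spinning one keeps its core busy while waiting, whereas the parking one yields the core to other threads between attempts.

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.locks.LockSupport;

// Hypothetical illustration only; not taken from Conversant Disruptor or JCTools.
public class WaitStrategySketch {

    /** Retries offer() in a busy loop; the calling thread occupies a core until it succeeds. */
    public static <T> long offerSpinning(BlockingQueue<T> queue, T item) {
        long retries = 0;
        while (!queue.offer(item)) {
            retries++;
            Thread.onSpinWait(); // spin hint to the CPU (Java 9+); the thread stays runnable
        }
        return retries;
    }

    /** Retries offer(), parking between attempts so the core can run other threads. */
    public static <T> long offerParking(BlockingQueue<T> queue, T item, long parkNanos) {
        long retries = 0;
        while (!queue.offer(item)) {
            retries++;
            LockSupport.parkNanos(parkNanos); // thread is descheduled for roughly parkNanos
        }
        return retries;
    }

    public static void main(String[] args) throws InterruptedException {
        BlockingQueue<String> queue = new ArrayBlockingQueue<>(1);
        queue.put("filler"); // the queue is now full, so producers have to wait

        // Slow consumer: frees one slot at a time.
        Thread consumer = new Thread(() -> {
            try {
                Thread.sleep(20);
                queue.take();
                Thread.sleep(20);
                queue.take();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        consumer.start();

        long spinRetries = offerSpinning(queue, "a");
        long parkRetries = offerParking(queue, "b", 1_000_000L); // 1 ms, an arbitrary value
        consumer.join();

        System.out.println("spin retries: " + spinRetries + ", park retries: " + parkRetries);
    }
}
```

With many more producers than cores, every thread stuck in offerSpinning competes with the consumer for CPU time, which is the situation described above for the 2- and 4-core runs.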
That is why, in the 2-core scenario, the JCTools queue performs worse than ABQ when the thread count increases: this kind of thing cannot occur with realistic loads (i.e. with no more spinning threads than available cores, or with a park-based waiting strategy).

Kind regards,

> Add optional support for Conversant DisruptorBlockingQueue in AsyncAppender
> ---------------------------------------------------------------------------
>
>                 Key: LOG4J2-1430
>                 URL: https://issues.apache.org/jira/browse/LOG4J2-1430
>             Project: Log4j 2
>          Issue Type: New Feature
>          Components: Appenders
>    Affects Versions: 2.6.1
>            Reporter: Matt Sicker
>            Assignee: Matt Sicker
>             Fix For: 2.7
>
>         Attachments: AsyncAppenderPerf01.txt, AsyncLogBenchmarks.log,
> conversantvsjctoolsnumthreads.jpg, jctools-vs-conversant-service-time.png,
> log4j2-1430-jctools-tmp-patch.txt, log4jHaswell2cpu2core.jpg,
> log4jHaswell2cpu4core.jpg, log4jrafile.log, log4jthread2cpu2core.log,
> log4jthread2cpu4core.log
>
>
> [Conversant Disruptor|https://github.com/conversant/disruptor] works as an
> implementation of BlockingQueue that is much faster than ArrayBlockingQueue.
> I did some benchmarks earlier and found it to be a bit faster:
> h3. AsyncAppender/ArrayBlockingQueue
> {code}
> Benchmark                                                     Mode  Samples         Score        Error  Units
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput10Params  thrpt       20   1101267.173 ±  17583.204  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput11Params  thrpt       20   1128269.255 ±  12188.910  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput1Param    thrpt       20   1525470.805 ±  56515.933  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput2Params   thrpt       20   1789434.196 ±  42733.475  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput3Params   thrpt       20   1803276.278 ±  34938.176  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput4Params   thrpt       20   1468550.776 ±  26402.286  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput5Params   thrpt       20   1322304.349 ±  22417.997  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput6Params   thrpt       20   1179756.489 ±  16502.276  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput7Params   thrpt       20   1324660.677 ±  18893.944  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput8Params   thrpt       20   1309365.962 ±  19602.489  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput9Params   thrpt       20   1422144.180 ±  20815.042  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughputSimple    thrpt       20   1247862.372 ±  18300.764  ops/s
> {code}
> h3. AsyncAppender/DisruptorBlockingQueue
> {code}
> Benchmark                                                     Mode  Samples         Score        Error  Units
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput10Params  thrpt       20   3704735.586 ±  59766.253  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput11Params  thrpt       20   3622175.410 ±  31975.353  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput1Param    thrpt       20   6862480.428 ± 121473.276  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput2Params   thrpt       20   6193288.988 ±  93545.144  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput3Params   thrpt       20   5715621.712 ± 131878.581  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput4Params   thrpt       20   5745187.005 ± 213854.016  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput5Params   thrpt       20   5307137.396 ±  88135.709  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput6Params   thrpt       20   4953015.419 ±  72100.403  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput7Params   thrpt       20   4833836.418 ±  52919.314  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput8Params   thrpt       20   4353791.507 ±  79047.812  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput9Params   thrpt       20   4136761.624 ±  67804.253  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughputSimple    thrpt       20   6719456.722 ± 187433.301  ops/s
> {code}
> h3. AsyncLogger
> {code}
> Benchmark                                              Mode  Samples         Score        Error  Units
> o.a.l.l.p.j.AsyncLoggersBenchmark.throughput10Params  thrpt       20   5075883.371 ± 180465.316  ops/s
> o.a.l.l.p.j.AsyncLoggersBenchmark.throughput11Params  thrpt       20   4867362.030 ± 193909.465  ops/s
> o.a.l.l.p.j.AsyncLoggersBenchmark.throughput1Param    thrpt       20  10294733.024 ± 226536.965  ops/s
> o.a.l.l.p.j.AsyncLoggersBenchmark.throughput2Params   thrpt       20   9021650.667 ± 351102.255  ops/s
> o.a.l.l.p.j.AsyncLoggersBenchmark.throughput3Params   thrpt       20   8079337.905 ± 115824.975  ops/s
> o.a.l.l.p.j.AsyncLoggersBenchmark.throughput4Params   thrpt       20   7347356.788 ±  66598.738  ops/s
> o.a.l.l.p.j.AsyncLoggersBenchmark.throughput5Params   thrpt       20   6930636.174 ± 150072.908  ops/s
> o.a.l.l.p.j.AsyncLoggersBenchmark.throughput6Params   thrpt       20   6309567.300 ± 293709.787  ops/s
> o.a.l.l.p.j.AsyncLoggersBenchmark.throughput7Params   thrpt       20   6051997.196 ± 268405.087  ops/s
> o.a.l.l.p.j.AsyncLoggersBenchmark.throughput8Params   thrpt       20   5273376.623 ±  99168.461  ops/s
> o.a.l.l.p.j.AsyncLoggersBenchmark.throughput9Params   thrpt       20   5091137.594 ± 150617.444  ops/s
> o.a.l.l.p.j.AsyncLoggersBenchmark.throughputSimple    thrpt       20  11136623.731 ± 400350.272  ops/s
> {code}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)