[ 
https://issues.apache.org/jira/browse/LOG4J2-1430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15387588#comment-15387588
 ] 

Anthony Maire commented on LOG4J2-1430:
---------------------------------------

Hello John

Thank you for all these benchmark, but I'm afraid they are not really relevant. 
Can you explain why you constrained the JVM to run on only 2 or 4 cores, with 
QPI transfers involved? I cannot see the point. As soon as the number of 
threads is high enough, we expect the queue to be mostly full, so for the 
non-JDK queues we will have a lot of threads spinning on very few cores, which 
is really really bad.

Can you either :
- use park based strategy for queues (i.e WAITING for Conversant and the 
default PARK for JCTools) if this kind of taskset setup made sense to your main 
use cases : I cannot imagine an application with 10 times more active threads 
than available core that want to use a spin or yield-based strategy ....
- don't constraint the JVM to let it use the 24 cores available on your box
- constraint the JVM to a whole NUMA node (using numactl -N 1 -m 1 java -jar 
....) but don't run more than 11 spinning producer threads.

Moreover in such heavily contented scenario, there is usually a throughput vs 
latency tradeoff (if some threads are frozen, contention is released and 
overall throughput increases), I'm pretty sure that running the same benchmark 
with JMH "sample time" mode will show that JCTools queue latency is lower and 
more stable than the Conversant one. That's why in the 2 core scenario you have 
JCTools queue that perform worse than ABQ when thread count increases : this 
kind of think cannot occurs with realistic loads (i.e with no more spinning 
threads than available core, or with a park-based waiting strategy)

Kind regards,

> Add optional support for Conversant DisruptorBlockingQueue in AsyncAppender
> ---------------------------------------------------------------------------
>
>                 Key: LOG4J2-1430
>                 URL: https://issues.apache.org/jira/browse/LOG4J2-1430
>             Project: Log4j 2
>          Issue Type: New Feature
>          Components: Appenders
>    Affects Versions: 2.6.1
>            Reporter: Matt Sicker
>            Assignee: Matt Sicker
>             Fix For: 2.7
>
>         Attachments: AsyncAppenderPerf01.txt, AsyncLogBenchmarks.log, 
> conversantvsjctoolsnumthreads.jpg, jctools-vs-conversant-service-time.png, 
> log4j2-1430-jctools-tmp-patch.txt, log4jHaswell2cpu2core.jpg, 
> log4jHaswell2cpu4core.jpg, log4jrafile.log, log4jthread2cpu2core.log, 
> log4jthread2cpu4core.log
>
>
> [Conversant Disruptor|https://github.com/conversant/disruptor] works as an 
> implementation of BlockingQueue that is much faster than ArrayBlockingQueue. 
> I did some benchmarks earlier and found it to be a bit faster:
> h3. AsyncAppender/ArrayBlockingQueue
> {code}
> Benchmark                                                       Mode  Samples 
>        Score       Error  Units
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput10Params    thrpt       20 
>  1101267.173 ± 17583.204  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput11Params    thrpt       20 
>  1128269.255 ± 12188.910  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput1Param      thrpt       20 
>  1525470.805 ± 56515.933  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput2Params     thrpt       20 
>  1789434.196 ± 42733.475  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput3Params     thrpt       20 
>  1803276.278 ± 34938.176  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput4Params     thrpt       20 
>  1468550.776 ± 26402.286  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput5Params     thrpt       20 
>  1322304.349 ± 22417.997  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput6Params     thrpt       20 
>  1179756.489 ± 16502.276  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput7Params     thrpt       20 
>  1324660.677 ± 18893.944  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput8Params     thrpt       20 
>  1309365.962 ± 19602.489  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput9Params     thrpt       20 
>  1422144.180 ± 20815.042  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughputSimple      thrpt       20 
>  1247862.372 ± 18300.764  ops/s
> {code}
> h3. AsyncAppender/DisruptorBlockingQueue
> {code}
> Benchmark                                                       Mode  Samples 
>        Score        Error  Units
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput10Params    thrpt       20 
>  3704735.586 ±  59766.253  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput11Params    thrpt       20 
>  3622175.410 ±  31975.353  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput1Param      thrpt       20 
>  6862480.428 ± 121473.276  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput2Params     thrpt       20 
>  6193288.988 ±  93545.144  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput3Params     thrpt       20 
>  5715621.712 ± 131878.581  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput4Params     thrpt       20 
>  5745187.005 ± 213854.016  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput5Params     thrpt       20 
>  5307137.396 ±  88135.709  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput6Params     thrpt       20 
>  4953015.419 ±  72100.403  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput7Params     thrpt       20 
>  4833836.418 ±  52919.314  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput8Params     thrpt       20 
>  4353791.507 ±  79047.812  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput9Params     thrpt       20 
>  4136761.624 ±  67804.253  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughputSimple      thrpt       20 
>  6719456.722 ± 187433.301  ops/s
> {code}
> h3. AsyncLogger
> {code}
> Benchmark                                                Mode  Samples        
>  Score        Error  Units
> o.a.l.l.p.j.AsyncLoggersBenchmark.throughput10Params    thrpt       20   
> 5075883.371 ± 180465.316  ops/s
> o.a.l.l.p.j.AsyncLoggersBenchmark.throughput11Params    thrpt       20   
> 4867362.030 ± 193909.465  ops/s
> o.a.l.l.p.j.AsyncLoggersBenchmark.throughput1Param      thrpt       20  
> 10294733.024 ± 226536.965  ops/s
> o.a.l.l.p.j.AsyncLoggersBenchmark.throughput2Params     thrpt       20   
> 9021650.667 ± 351102.255  ops/s
> o.a.l.l.p.j.AsyncLoggersBenchmark.throughput3Params     thrpt       20   
> 8079337.905 ± 115824.975  ops/s
> o.a.l.l.p.j.AsyncLoggersBenchmark.throughput4Params     thrpt       20   
> 7347356.788 ±  66598.738  ops/s
> o.a.l.l.p.j.AsyncLoggersBenchmark.throughput5Params     thrpt       20   
> 6930636.174 ± 150072.908  ops/s
> o.a.l.l.p.j.AsyncLoggersBenchmark.throughput6Params     thrpt       20   
> 6309567.300 ± 293709.787  ops/s
> o.a.l.l.p.j.AsyncLoggersBenchmark.throughput7Params     thrpt       20   
> 6051997.196 ± 268405.087  ops/s
> o.a.l.l.p.j.AsyncLoggersBenchmark.throughput8Params     thrpt       20   
> 5273376.623 ±  99168.461  ops/s
> o.a.l.l.p.j.AsyncLoggersBenchmark.throughput9Params     thrpt       20   
> 5091137.594 ± 150617.444  ops/s
> o.a.l.l.p.j.AsyncLoggersBenchmark.throughputSimple      thrpt       20  
> 11136623.731 ± 400350.272  ops/s
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: log4j-dev-unsubscr...@logging.apache.org
For additional commands, e-mail: log4j-dev-h...@logging.apache.org

Reply via email to