[ 
https://issues.apache.org/jira/browse/LOG4J2-1430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15345211#comment-15345211
 ] 

Anthony Maire commented on LOG4J2-1430:
---------------------------------------

The approch of using a NoOp appender for testing AsyncAppender isn't a bad idea 
at all. The "issue" is that it's easy to do a wrong interpretation of the 
result.

The issue is that we naturally expect the queue to be the bottleneck (and in 
fact it is the bottleneck in a lots of ABQ tests), but with those 
high-performance queue, the single-threaded code between queue.take() and 
appender.append() is the bottleneck. With 8 producer threads, I had the 
consumer thread 100% busy but producers were not. So we are measuring something 
that doesn't reflect the real world (since in real application we don't expect 
the queue to be always full), and this benchmark is not good to show which 
queue implementation is the best. 

I'm probably biaised because I'm working for a low-latency company, but the 
best metrics to evaluate the "performance"  of the AsyncAppender are latency, 
and latency is really damn hard to measure (cf this excellent talk from Gil 
Tene on this subject: https://www.infoq.com/presentations/latency-response-time 
). For such cases, I usually mesure latency percentiles at a load that is below 
maximal throughput (since all systems suck from a latency point of view when 
running near their saturation point) by recording  all individual results in a 
HdrHistogram. I don't think that is something that can be done with JMH, I 
usually write custom load generator to do this

A major caveat (among lots of others) with measuring latency is that the 
production environment may be very different from the idealized environment you 
have when doing benchmarks. Especially, the system might be overloaded (i.e 
there are more active threads in the whole machine, including kernel threads, 
than number of available core) and/or virtualized. In this situation the kind 
of concurrency (wait-free, lock-free ...) will matter a lot since threads can 
be suspended for several milliseconds

Showing this effect in a benchmark is hard, because it occurs at a very low 
frequency with "production-like" load and there are other causes of low 
frequency outliers. It may require lots of analysis. The usual way to show that 
an algorithm as poorer concurrency properties is to increase the throughput 
and/or the number of threads of the load generator until some candidates behave 
poorly: lock-free algorithm will scale better than blocking algorithms.

But with current AsyncAppender code, doing this kind of test is not possible, 
because the async thread isn't performant enough. To compare queues, the best 
thing is to compare them without being integrated into AsyncAppender. 
Otherwise, we need to optimise the AsyncAppender itself a lot to show 
differences between queues, but this might be a pretty useless effort since 
current code is probably efficient enough with a real underlying appender. 

> Add optional support for Conversant DisruptorBlockingQueue in AsyncAppender
> ---------------------------------------------------------------------------
>
>                 Key: LOG4J2-1430
>                 URL: https://issues.apache.org/jira/browse/LOG4J2-1430
>             Project: Log4j 2
>          Issue Type: New Feature
>          Components: Appenders
>    Affects Versions: 2.6.1
>            Reporter: Matt Sicker
>            Assignee: Matt Sicker
>             Fix For: 2.7
>
>         Attachments: AsyncAppenderPerf01.txt, 
> log4j2-1430-jctools-tmp-patch.txt
>
>
> [Conversant Disruptor|https://github.com/conversant/disruptor] works as an 
> implementation of BlockingQueue that is much faster than ArrayBlockingQueue. 
> I did some benchmarks earlier and found it to be a bit faster:
> h3. AsyncAppender/ArrayBlockingQueue
> {code}
> Benchmark                                                       Mode  Samples 
>        Score       Error  Units
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput10Params    thrpt       20 
>  1101267.173 ± 17583.204  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput11Params    thrpt       20 
>  1128269.255 ± 12188.910  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput1Param      thrpt       20 
>  1525470.805 ± 56515.933  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput2Params     thrpt       20 
>  1789434.196 ± 42733.475  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput3Params     thrpt       20 
>  1803276.278 ± 34938.176  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput4Params     thrpt       20 
>  1468550.776 ± 26402.286  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput5Params     thrpt       20 
>  1322304.349 ± 22417.997  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput6Params     thrpt       20 
>  1179756.489 ± 16502.276  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput7Params     thrpt       20 
>  1324660.677 ± 18893.944  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput8Params     thrpt       20 
>  1309365.962 ± 19602.489  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput9Params     thrpt       20 
>  1422144.180 ± 20815.042  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughputSimple      thrpt       20 
>  1247862.372 ± 18300.764  ops/s
> {code}
> h3. AsyncAppender/DisruptorBlockingQueue
> {code}
> Benchmark                                                       Mode  Samples 
>        Score        Error  Units
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput10Params    thrpt       20 
>  3704735.586 ±  59766.253  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput11Params    thrpt       20 
>  3622175.410 ±  31975.353  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput1Param      thrpt       20 
>  6862480.428 ± 121473.276  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput2Params     thrpt       20 
>  6193288.988 ±  93545.144  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput3Params     thrpt       20 
>  5715621.712 ± 131878.581  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput4Params     thrpt       20 
>  5745187.005 ± 213854.016  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput5Params     thrpt       20 
>  5307137.396 ±  88135.709  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput6Params     thrpt       20 
>  4953015.419 ±  72100.403  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput7Params     thrpt       20 
>  4833836.418 ±  52919.314  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput8Params     thrpt       20 
>  4353791.507 ±  79047.812  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput9Params     thrpt       20 
>  4136761.624 ±  67804.253  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughputSimple      thrpt       20 
>  6719456.722 ± 187433.301  ops/s
> {code}
> h3. AsyncLogger
> {code}
> Benchmark                                                Mode  Samples        
>  Score        Error  Units
> o.a.l.l.p.j.AsyncLoggersBenchmark.throughput10Params    thrpt       20   
> 5075883.371 ± 180465.316  ops/s
> o.a.l.l.p.j.AsyncLoggersBenchmark.throughput11Params    thrpt       20   
> 4867362.030 ± 193909.465  ops/s
> o.a.l.l.p.j.AsyncLoggersBenchmark.throughput1Param      thrpt       20  
> 10294733.024 ± 226536.965  ops/s
> o.a.l.l.p.j.AsyncLoggersBenchmark.throughput2Params     thrpt       20   
> 9021650.667 ± 351102.255  ops/s
> o.a.l.l.p.j.AsyncLoggersBenchmark.throughput3Params     thrpt       20   
> 8079337.905 ± 115824.975  ops/s
> o.a.l.l.p.j.AsyncLoggersBenchmark.throughput4Params     thrpt       20   
> 7347356.788 ±  66598.738  ops/s
> o.a.l.l.p.j.AsyncLoggersBenchmark.throughput5Params     thrpt       20   
> 6930636.174 ± 150072.908  ops/s
> o.a.l.l.p.j.AsyncLoggersBenchmark.throughput6Params     thrpt       20   
> 6309567.300 ± 293709.787  ops/s
> o.a.l.l.p.j.AsyncLoggersBenchmark.throughput7Params     thrpt       20   
> 6051997.196 ± 268405.087  ops/s
> o.a.l.l.p.j.AsyncLoggersBenchmark.throughput8Params     thrpt       20   
> 5273376.623 ±  99168.461  ops/s
> o.a.l.l.p.j.AsyncLoggersBenchmark.throughput9Params     thrpt       20   
> 5091137.594 ± 150617.444  ops/s
> o.a.l.l.p.j.AsyncLoggersBenchmark.throughputSimple      thrpt       20  
> 11136623.731 ± 400350.272  ops/s
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: log4j-dev-unsubscr...@logging.apache.org
For additional commands, e-mail: log4j-dev-h...@logging.apache.org

Reply via email to