[ https://issues.apache.org/jira/browse/LOG4J2-1430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15338427#comment-15338427 ]

Anthony Maire commented on LOG4J2-1430:
---------------------------------------

I made some tests with the branch to compare the Conversant DisruptorBlockingQueue 
and the MpscArrayQueue from JCTools ( https://github.com/JCTools/JCTools ). 
Since the JCTools queue isn't a BlockingQueue, I extended it to add the 
BlockingQueue methods, using LockSupport.parkNanos(1L) whenever waiting was needed.
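
A minimal sketch of what such an adapter might look like (hypothetical, not the actual patch; `ConcurrentLinkedQueue` stands in here for JCTools' MpscArrayQueue, which isn't on the standard classpath):

```java
import java.util.Queue;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.concurrent.locks.LockSupport;

// Hypothetical sketch of the "add the BlockingQueue methods" approach described
// above. ConcurrentLinkedQueue stands in for JCTools' MpscArrayQueue; a real
// adapter would also handle interruption and implement the full BlockingQueue
// interface.
final class ParkingQueueAdapter<E> {
    private final Queue<E> delegate;

    ParkingQueueAdapter(Queue<E> delegate) {
        this.delegate = delegate;
    }

    // Blocking put: retry offer(), parking between attempts while the queue is
    // full. The actual pause of parkNanos(1L) is OS-dependent, not 1 ns.
    void put(E e) {
        while (!delegate.offer(e)) {
            LockSupport.parkNanos(1L);
        }
    }

    // Blocking take: retry poll(), parking between attempts while the queue is empty.
    E take() {
        E e;
        while ((e = delegate.poll()) == null) {
            LockSupport.parkNanos(1L);
        }
        return e;
    }
}
```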

I first ran the JMH benchmark on my machine (Windows 7 on a Core i5-2500K, 4 
cores, no HT) with 8 and 16 threads, and there was no noticeable difference 
between the two alternative implementations.

Then, since the LockSupport.parkNanos(1L) behavior is heavily OS-dependent 
(cf. 
http://vanillajava.blogspot.fr/2012/04/yield-sleep0-wait01-and-parknanos1.html 
), I ran some tests on a Linux server: CentOS 6.7 with dual Xeon E5-2643 v3 
(24 hyperthreads for the whole machine). There the parkNanos pause was around 
50µs instead of the 1ms seen on Windows.
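
That granularity is easy to check directly. A quick (illustrative) measurement, where the printed average should land near 1ms on a default Windows setup and in the tens of microseconds on a typical Linux kernel:

```java
import java.util.concurrent.locks.LockSupport;

// Rough measurement of the real pause delivered by LockSupport.parkNanos(1L).
// The result depends on the OS timer resolution, which is the effect discussed
// above; the iteration count here is arbitrary.
final class ParkNanosGranularity {
    static long averageParkNanos(int iterations) {
        // Warm-up pass so JIT compilation does not skew the measurement.
        for (int i = 0; i < iterations; i++) LockSupport.parkNanos(1L);

        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) LockSupport.parkNanos(1L);
        return (System.nanoTime() - start) / iterations;
    }

    public static void main(String[] args) {
        System.out.println("average parkNanos(1L) pause: "
                + averageParkNanos(1000) + " ns");
    }
}
```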

There was no noticeable difference up to 8 threads, but as the number of threads 
increased, the DisruptorBlockingQueue performed badly and showed no clear 
advantage over ArrayBlockingQueue at 36 threads, while the MPSC queue sustained 
a constant throughput.

Here are the rough figures I got, in the following order: ABQ, Conversant 
Disruptor, JCTools MPSC:
{noformat}
1 thread   :  925k   3M     3.15M
8 threads  :  600k   2.95M  3.15M
12 threads :  550k   2.2M   3.10M
16 threads :  500k   1.9M   3.6M
24 threads :  600k   650k   3.45M
36 threads :  530k   600k   3.5M
{noformat}

I assume that the high pause time of parkNanos on Windows (used by the Conversant 
wait strategy too) was artificially reducing contention between producers, 
allowing it to perform fairly well.

When I compare the latency profiles of both implementations (with JMH "SampleTime" 
mode) at 8 threads, JCTools performs better at the higher percentiles (the 
99.9999% figure cannot be trusted because the number of samples is far too low).

{noformat}
Conversant Disruptor results :

Result: 2761.833 ±(99.9%) 26.908 ns/op [Average]
  Statistics: (min, avg, max) = (344.000, 2761.833, 16039936.000), stdev = 
11613.072
  Confidence interval (99.9%): [2734.925, 2788.741]
  Samples, N = 2016821
        mean =   2761.833 ±(99.9%) 26.908 ns/op
         min =    344.000 ns/op
  p( 0.0000) =    344.000 ns/op
  p(50.0000) =   2148.000 ns/op
  p(90.0000) =   5024.000 ns/op
  p(95.0000) =   6728.000 ns/op
  p(99.0000) =  11440.000 ns/op
  p(99.9000) =  18912.000 ns/op
  p(99.9900) =  27328.000 ns/op
  p(99.9990) =  35509.234 ns/op
  p(99.9999) = 2346156.268 ns/op
         max = 16039936.000 ns/op



JCTools MPSC results

Result: 3115.571 ±(99.9%) 23.081 ns/op [Average]
  Statistics: (min, avg, max) = (325.000, 3115.571, 6905856.000), stdev = 
9438.165
  Confidence interval (99.9%): [3092.490, 3138.651]
  Samples, N = 1810522
        mean =   3115.571 ±(99.9%) 23.081 ns/op
         min =    325.000 ns/op
  p( 0.0000) =    325.000 ns/op
  p(50.0000) =   2984.000 ns/op
  p(90.0000) =   4352.000 ns/op
  p(95.0000) =   4816.000 ns/op
  p(99.0000) =   5736.000 ns/op
  p(99.9000) =   6872.000 ns/op
  p(99.9900) =   7976.000 ns/op
  p(99.9990) =  10652.653 ns/op
  p(99.9999) = 6872656.978 ns/op
         max = 6905856.000 ns/op
{noformat}

Interesting fact: with 16 to 36 threads, top reported a CPU usage for the 
JCTools queue that never exceeded 1200% (while the Conversant queue was using the 
full CPU), so half of the machine's CPU cycles were not used. I think there is 
room for improvement, probably around my quick-and-dirty BlockingQueue 
implementation.
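
One way those idle cycles could be put to use is a progressive backoff instead of parking immediately: spin briefly, then yield, then park. A hypothetical sketch (thresholds and names are illustrative only, similar in spirit to the wait strategies of the LMAX Disruptor, not what the patch does):

```java
import java.util.concurrent.locks.LockSupport;

// Hypothetical progressive-backoff helper: spin first, then yield, then park.
// The thresholds below are illustrative; tuning them trades CPU burn for
// wake-up latency.
final class ProgressiveBackoff {
    static final int SPIN_LIMIT = 100;   // busy-spin attempts before yielding
    static final int YIELD_LIMIT = 200;  // total attempts before parking

    private int counter = 0;

    // Call after each failed poll()/offer(); returns the phase used:
    // 0 = spin, 1 = yield, 2 = park.
    int backoff() {
        if (counter < SPIN_LIMIT) {
            counter++;
            Thread.onSpinWait();   // busy-spin hint (Java 9+)
            return 0;
        } else if (counter < YIELD_LIMIT) {
            counter++;
            Thread.yield();
            return 1;
        } else {
            LockSupport.parkNanos(1L);
            return 2;
        }
    }

    // Reset after a successful operation so the next wait starts by spinning.
    void reset() { counter = 0; }
}
```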

Anyway, I'm really not sure that the Conversant DisruptorBlockingQueue is a 
good alternative to ABQ.



> Add optional support for Conversant DisruptorBlockingQueue in AsyncAppender
> ---------------------------------------------------------------------------
>
>                 Key: LOG4J2-1430
>                 URL: https://issues.apache.org/jira/browse/LOG4J2-1430
>             Project: Log4j 2
>          Issue Type: New Feature
>          Components: Appenders
>    Affects Versions: 2.6.1
>            Reporter: Matt Sicker
>            Assignee: Matt Sicker
>             Fix For: 2.7
>
>         Attachments: AsyncAppenderPerf01.txt
>
>
> [Conversant Disruptor|https://github.com/conversant/disruptor] works as an 
> implementation of BlockingQueue that is much faster than ArrayBlockingQueue. 
> I did some benchmarks earlier and found it to be a bit faster:
> h3. AsyncAppender/ArrayBlockingQueue
> {code}
> Benchmark                                                       Mode  Samples 
>        Score       Error  Units
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput10Params    thrpt       20 
>  1101267.173 ± 17583.204  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput11Params    thrpt       20 
>  1128269.255 ± 12188.910  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput1Param      thrpt       20 
>  1525470.805 ± 56515.933  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput2Params     thrpt       20 
>  1789434.196 ± 42733.475  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput3Params     thrpt       20 
>  1803276.278 ± 34938.176  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput4Params     thrpt       20 
>  1468550.776 ± 26402.286  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput5Params     thrpt       20 
>  1322304.349 ± 22417.997  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput6Params     thrpt       20 
>  1179756.489 ± 16502.276  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput7Params     thrpt       20 
>  1324660.677 ± 18893.944  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput8Params     thrpt       20 
>  1309365.962 ± 19602.489  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput9Params     thrpt       20 
>  1422144.180 ± 20815.042  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughputSimple      thrpt       20 
>  1247862.372 ± 18300.764  ops/s
> {code}
> h3. AsyncAppender/DisruptorBlockingQueue
> {code}
> Benchmark                                                       Mode  Samples 
>        Score        Error  Units
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput10Params    thrpt       20 
>  3704735.586 ±  59766.253  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput11Params    thrpt       20 
>  3622175.410 ±  31975.353  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput1Param      thrpt       20 
>  6862480.428 ± 121473.276  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput2Params     thrpt       20 
>  6193288.988 ±  93545.144  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput3Params     thrpt       20 
>  5715621.712 ± 131878.581  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput4Params     thrpt       20 
>  5745187.005 ± 213854.016  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput5Params     thrpt       20 
>  5307137.396 ±  88135.709  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput6Params     thrpt       20 
>  4953015.419 ±  72100.403  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput7Params     thrpt       20 
>  4833836.418 ±  52919.314  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput8Params     thrpt       20 
>  4353791.507 ±  79047.812  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput9Params     thrpt       20 
>  4136761.624 ±  67804.253  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughputSimple      thrpt       20 
>  6719456.722 ± 187433.301  ops/s
> {code}
> h3. AsyncLogger
> {code}
> Benchmark                                                Mode  Samples        
>  Score        Error  Units
> o.a.l.l.p.j.AsyncLoggersBenchmark.throughput10Params    thrpt       20   
> 5075883.371 ± 180465.316  ops/s
> o.a.l.l.p.j.AsyncLoggersBenchmark.throughput11Params    thrpt       20   
> 4867362.030 ± 193909.465  ops/s
> o.a.l.l.p.j.AsyncLoggersBenchmark.throughput1Param      thrpt       20  
> 10294733.024 ± 226536.965  ops/s
> o.a.l.l.p.j.AsyncLoggersBenchmark.throughput2Params     thrpt       20   
> 9021650.667 ± 351102.255  ops/s
> o.a.l.l.p.j.AsyncLoggersBenchmark.throughput3Params     thrpt       20   
> 8079337.905 ± 115824.975  ops/s
> o.a.l.l.p.j.AsyncLoggersBenchmark.throughput4Params     thrpt       20   
> 7347356.788 ±  66598.738  ops/s
> o.a.l.l.p.j.AsyncLoggersBenchmark.throughput5Params     thrpt       20   
> 6930636.174 ± 150072.908  ops/s
> o.a.l.l.p.j.AsyncLoggersBenchmark.throughput6Params     thrpt       20   
> 6309567.300 ± 293709.787  ops/s
> o.a.l.l.p.j.AsyncLoggersBenchmark.throughput7Params     thrpt       20   
> 6051997.196 ± 268405.087  ops/s
> o.a.l.l.p.j.AsyncLoggersBenchmark.throughput8Params     thrpt       20   
> 5273376.623 ±  99168.461  ops/s
> o.a.l.l.p.j.AsyncLoggersBenchmark.throughput9Params     thrpt       20   
> 5091137.594 ± 150617.444  ops/s
> o.a.l.l.p.j.AsyncLoggersBenchmark.throughputSimple      thrpt       20  
> 11136623.731 ± 400350.272  ops/s
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
