[jira] [Commented] (LOG4J2-1855) Add an optional random delay in TimeBasedTriggeringPolicy

2017-03-23 Thread Anthony Maire (JIRA)

[ 
https://issues.apache.org/jira/browse/LOG4J2-1855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15938357#comment-15938357
 ] 

Anthony Maire commented on LOG4J2-1855:
---

Pull request: https://github.com/apache/logging-log4j2/pull/68
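The randomized-delay idea from the issue description can be sketched as follows. This is a minimal illustration, not the pull request's code; the method and parameter names are hypothetical:

```java
import java.util.concurrent.ThreadLocalRandom;

public class RandomRolloverDelay {

    /**
     * Pick a random delay in [0, maxRandomDelaySeconds] so that many JVMs
     * on one host do not all roll and compress their logs simultaneously.
     * The parameter name is hypothetical, not the property added by the PR.
     */
    public static long randomDelayMillis(long maxRandomDelaySeconds) {
        if (maxRandomDelaySeconds <= 0) {
            return 0L; // no delay configured: keep the current behavior
        }
        // nextLong's bound is exclusive, so add 1 to make the range inclusive.
        return ThreadLocalRandom.current().nextLong(maxRandomDelaySeconds * 1000 + 1);
    }

    public static void main(String[] args) {
        System.out.println("delaying rollover by " + randomDelayMillis(60) + " ms");
    }
}
```

Each JVM draws its own delay independently, which spreads the rollover (and the compression load) across the configured window.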

> Add an optional random delay in TimeBasedTriggeringPolicy
> -
>
> Key: LOG4J2-1855
> URL: https://issues.apache.org/jira/browse/LOG4J2-1855
> Project: Log4j 2
>  Issue Type: New Feature
>  Components: Appenders
>Affects Versions: 2.8.1
>Reporter: Anthony Maire
>
> When there are many JVMs on the same physical host configured to roll 
> their log files with a TimeBasedTriggeringPolicy, rollover creates a temporary 
> load that can impact the behavior of the applications, since all JVMs will 
> roll and compress their old files at the same time.
> The purpose of this enhancement is to add an additional property to 
> TimeBasedTriggeringPolicy that delays the rollover by a random amount of 
> time, avoiding that massive simultaneous rolling.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

-
To unsubscribe, e-mail: log4j-dev-unsubscr...@logging.apache.org
For additional commands, e-mail: log4j-dev-h...@logging.apache.org



[jira] [Commented] (LOG4J2-1430) Add optional support for Conversant DisruptorBlockingQueue in AsyncAppender

2016-07-28 Thread Anthony Maire (JIRA)

[ 
https://issues.apache.org/jira/browse/LOG4J2-1430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15397718#comment-15397718
 ] 

Anthony Maire commented on LOG4J2-1430:
---

Running up to 10 times more spinning threads than available cores can NEVER be 
relevant; it's like trying to measure the speed of a car without releasing the 
handbrake.

For the "simulating a desktop system" and the "perf enthusiasts use taskset" 
parts, the issue is that you're using cores from different CPUs: in a desktop 
system, cores never communicate through a QPI link, and perf enthusiasts will 
try to exploit NUMA locality as much as possible. The taskset setting you used 
is counter-productive tuning, since it causes lots of cross-node memory 
accesses.

However, as I said in another comment, these throughput tests are not really 
relevant; I will try to run some other tests in the near future.

> Add optional support for Conversant DisruptorBlockingQueue in AsyncAppender
> ---
>
> Key: LOG4J2-1430
> URL: https://issues.apache.org/jira/browse/LOG4J2-1430
> Project: Log4j 2
>  Issue Type: New Feature
>  Components: Appenders
>Affects Versions: 2.6.1
>Reporter: Matt Sicker
>Assignee: Matt Sicker
> Fix For: 2.7
>
> Attachments: AsyncAppenderPerf01.txt, AsyncLogBenchmarks.log, 
> conversantvsjctoolsnumthreads.jpg, jctools-vs-conversant-service-time.png, 
> log4j2-1430-jctools-tmp-patch.txt, log4jHaswell2cpu2core.jpg, 
> log4jHaswell2cpu4core.jpg, log4jrafile.log, log4jthread2cpu2core.log, 
> log4jthread2cpu4core.log
>
>
> [Conversant Disruptor|https://github.com/conversant/disruptor] works as an 
> implementation of BlockingQueue that is much faster than ArrayBlockingQueue. 
> I did some benchmarks earlier and found it to be a bit faster:
> h3. AsyncAppender/ArrayBlockingQueue
> {code}
> Benchmark   Mode  Samples 
>Score   Error  Units
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput10Paramsthrpt   20 
>  1101267.173 ± 17583.204  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput11Paramsthrpt   20 
>  1128269.255 ± 12188.910  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput1Param  thrpt   20 
>  1525470.805 ± 56515.933  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput2Params thrpt   20 
>  1789434.196 ± 42733.475  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput3Params thrpt   20 
>  1803276.278 ± 34938.176  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput4Params thrpt   20 
>  1468550.776 ± 26402.286  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput5Params thrpt   20 
>  1322304.349 ± 22417.997  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput6Params thrpt   20 
>  1179756.489 ± 16502.276  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput7Params thrpt   20 
>  1324660.677 ± 18893.944  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput8Params thrpt   20 
>  1309365.962 ± 19602.489  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput9Params thrpt   20 
>  1422144.180 ± 20815.042  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughputSimple  thrpt   20 
>  1247862.372 ± 18300.764  ops/s
> {code}
> h3. AsyncAppender/DisruptorBlockingQueue
> {code}
> Benchmark   Mode  Samples 
>ScoreError  Units
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput10Paramsthrpt   20 
>  3704735.586 ±  59766.253  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput11Paramsthrpt   20 
>  3622175.410 ±  31975.353  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput1Param  thrpt   20 
>  6862480.428 ± 121473.276  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput2Params thrpt   20 
>  6193288.988 ±  93545.144  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput3Params thrpt   20 
>  5715621.712 ± 131878.581  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput4Params thrpt   20 
>  5745187.005 ± 213854.016  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput5Params thrpt   20 
>  5307137.396 ±  88135.709  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput6Params thrpt   20 
>  4953015.419 ±  72100.403  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput7Params thrpt   20 
>  4833836.418 ±  52919.314  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput8Params thrpt   20 
>  4353791.507 ±  79047.812  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput9Params thrpt   20 
>  4136761.624 ±  67804.253  ops/s
> 

[jira] [Commented] (LOG4J2-1430) Add optional support for Conversant DisruptorBlockingQueue in AsyncAppender

2016-07-28 Thread Anthony Maire (JIRA)

[ 
https://issues.apache.org/jira/browse/LOG4J2-1430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15397704#comment-15397704
 ] 

Anthony Maire commented on LOG4J2-1430:
---

Hi Remko

I had a look at the performance page; the tests included there are far more 
relevant than the JMH throughput test, which measures something that will never 
exist in the real world. I will definitely try to run it with the various queue 
implementations when I have some time (but I'm currently pretty busy). However, 
I'm very surprised by the figures for AsyncLogger: how can the throughput still 
double when the thread count doubles, when there are already as many threads as 
available cores/hyperthreads? I suspect there is something wrong there. Did you 
investigate that point?

About the tests with the RAF appender: I made some further tests using the 
rolling variant, with a maximum of 1 file, all writing to /dev/shm to avoid 
real fs/disk jitter. The maximum rate I got was around 900K msg/s, very far 
from the 3 to 7M msg/s I got with a no-op appender and custom queues. Since the 
JMH test only measures how fast the appender thread can dequeue (cf. the huge 
increase with the JCTools queue when I changed the offer method to avoid false 
sharing between producer and consumer), comparisons based on it are really 
pointless. The degree of contention in the real world will be much lower, even 
when not blocking on disk.

> Add optional support for Conversant DisruptorBlockingQueue in AsyncAppender
> ---
>
> Key: LOG4J2-1430
> URL: https://issues.apache.org/jira/browse/LOG4J2-1430
> Project: Log4j 2
>  Issue Type: New Feature
>  Components: Appenders
>Affects Versions: 2.6.1
>Reporter: Matt Sicker
>Assignee: Matt Sicker
> Fix For: 2.7
>
> Attachments: AsyncAppenderPerf01.txt, AsyncLogBenchmarks.log, 
> conversantvsjctoolsnumthreads.jpg, jctools-vs-conversant-service-time.png, 
> log4j2-1430-jctools-tmp-patch.txt, log4jHaswell2cpu2core.jpg, 
> log4jHaswell2cpu4core.jpg, log4jrafile.log, log4jthread2cpu2core.log, 
> log4jthread2cpu4core.log
>
>
> [Conversant Disruptor|https://github.com/conversant/disruptor] works as an 
> implementation of BlockingQueue that is much faster than ArrayBlockingQueue. 
> I did some benchmarks earlier and found it to be a bit faster:
> h3. AsyncAppender/ArrayBlockingQueue
> {code}
> Benchmark   Mode  Samples 
>Score   Error  Units
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput10Paramsthrpt   20 
>  1101267.173 ± 17583.204  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput11Paramsthrpt   20 
>  1128269.255 ± 12188.910  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput1Param  thrpt   20 
>  1525470.805 ± 56515.933  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput2Params thrpt   20 
>  1789434.196 ± 42733.475  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput3Params thrpt   20 
>  1803276.278 ± 34938.176  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput4Params thrpt   20 
>  1468550.776 ± 26402.286  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput5Params thrpt   20 
>  1322304.349 ± 22417.997  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput6Params thrpt   20 
>  1179756.489 ± 16502.276  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput7Params thrpt   20 
>  1324660.677 ± 18893.944  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput8Params thrpt   20 
>  1309365.962 ± 19602.489  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput9Params thrpt   20 
>  1422144.180 ± 20815.042  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughputSimple  thrpt   20 
>  1247862.372 ± 18300.764  ops/s
> {code}
> h3. AsyncAppender/DisruptorBlockingQueue
> {code}
> Benchmark   Mode  Samples 
>ScoreError  Units
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput10Paramsthrpt   20 
>  3704735.586 ±  59766.253  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput11Paramsthrpt   20 
>  3622175.410 ±  31975.353  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput1Param  thrpt   20 
>  6862480.428 ± 121473.276  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput2Params thrpt   20 
>  6193288.988 ±  93545.144  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput3Params thrpt   20 
>  5715621.712 ± 131878.581  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput4Params thrpt   20 
>  5745187.005 ± 213854.016  ops/s
> 

[jira] [Comment Edited] (LOG4J2-1430) Add optional support for Conversant DisruptorBlockingQueue in AsyncAppender

2016-07-21 Thread Anthony Maire (JIRA)

[ 
https://issues.apache.org/jira/browse/LOG4J2-1430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15387588#comment-15387588
 ] 

Anthony Maire edited comment on LOG4J2-1430 at 7/21/16 3:02 PM:


Hello John

Thank you for all these benchmarks, but I'm afraid they are not really 
relevant. Can you explain why you constrained the JVM to run on only 2 or 4 
cores, with QPI transfers involved? I cannot see the point. As soon as the 
number of threads is high enough, we expect the queue to be mostly full, so for 
the non-JDK queues we will have a lot of threads spinning on very few cores, 
which is really, really bad.

Can you either:
- use a park-based strategy for the queues (i.e. WAITING for Conversant and the 
default PARK for JCTools) if this kind of taskset setup matches your main use 
cases: I cannot imagine an application with 10 times more active threads than 
available cores wanting to use a spin- or yield-based strategy 
- not constrain the JVM, letting it use all 24 cores available on your box
- constrain the JVM to a whole NUMA node (using numactl -N 1 -m 1 java -jar 
) but run no more than 11 spinning producer threads.

Moreover, in such heavily contended scenarios there is usually a throughput vs 
latency tradeoff (if some threads are frozen, contention is released and 
overall throughput increases). I'm pretty sure that running the same benchmark 
with JMH "sample time" mode would show that the JCTools queue's latency is 
lower and more stable than the Conversant one's. That's why, in the 2-core 
scenario, the JCTools queue performs worse than ABQ when the thread count 
increases: this kind of thing cannot occur with realistic loads (i.e. with no 
more spinning threads than available cores, or with a park-based waiting 
strategy).

Kind regards,
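The throughput-vs-latency point can be illustrated with a plain-Java, single-threaded sketch of a "sample time" style measurement. This is a simplification with hypothetical helper names; a real comparison would use JMH's SampleTime mode under multi-threaded contention:

```java
import java.util.Arrays;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class OfferLatencySample {

    /**
     * Sample per-offer latency (ns) on a bounded queue: a "sample time"
     * style measurement rather than a raw throughput score. Single-threaded,
     * so it only illustrates the shape of the measurement, not contention.
     */
    public static long[] sampleOfferLatencies(BlockingQueue<String> q, int samples) {
        long[] ns = new long[samples];
        for (int i = 0; i < samples; i++) {
            long t0 = System.nanoTime();
            if (!q.offer("msg")) {
                q.poll();       // drain one slot so the next offer succeeds,
                q.offer("msg"); // keeping the queue at its "mostly full" state
            }
            ns[i] = System.nanoTime() - t0;
        }
        Arrays.sort(ns);        // sorted, so percentiles are easy to read off
        return ns;
    }

    public static void main(String[] args) {
        long[] ns = sampleOfferLatencies(new ArrayBlockingQueue<>(1024), 100_000);
        System.out.println("p50=" + ns[ns.length / 2] + "ns  p99="
                + ns[(int) (ns.length * 0.99)] + "ns");
    }
}
```

A latency distribution like this can expose stalls that an averaged ops/s figure hides, which is exactly the tradeoff described above.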



> Add optional support for Conversant DisruptorBlockingQueue in AsyncAppender
> ---
>
> Key: LOG4J2-1430
> URL: https://issues.apache.org/jira/browse/LOG4J2-1430
> Project: Log4j 2
>  Issue Type: New Feature
>  Components: Appenders
>Affects Versions: 2.6.1
>Reporter: Matt Sicker
>Assignee: Matt Sicker
> Fix For: 2.7
>
> Attachments: AsyncAppenderPerf01.txt, AsyncLogBenchmarks.log, 
> conversantvsjctoolsnumthreads.jpg, jctools-vs-conversant-service-time.png, 
> log4j2-1430-jctools-tmp-patch.txt, log4jHaswell2cpu2core.jpg, 
> log4jHaswell2cpu4core.jpg, log4jrafile.log, log4jthread2cpu2core.log, 
> log4jthread2cpu4core.log
>
>
> [Conversant Disruptor|https://github.com/conversant/disruptor] works as an 
> implementation of BlockingQueue that is much faster than ArrayBlockingQueue. 
> I did some benchmarks earlier and found it to be a bit faster:
> h3. AsyncAppender/ArrayBlockingQueue
> {code}
> Benchmark   Mode  Samples 
>Score   Error  Units
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput10Paramsthrpt   20 
>  1101267.173 ± 17583.204  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput11Paramsthrpt   20 
>  1128269.255 ± 12188.910  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput1Param   

[jira] [Commented] (LOG4J2-1430) Add optional support for Conversant DisruptorBlockingQueue in AsyncAppender

2016-07-21 Thread Anthony Maire (JIRA)

[ 
https://issues.apache.org/jira/browse/LOG4J2-1430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15387588#comment-15387588
 ] 

Anthony Maire commented on LOG4J2-1430:
---

Hello John

Thank you for all these benchmarks, but I'm afraid they are not really 
relevant. Can you explain why you constrained the JVM to run on only 2 or 4 
cores, with QPI transfers involved? I cannot see the point. As soon as the 
number of threads is high enough, we expect the queue to be mostly full, so for 
the non-JDK queues we will have a lot of threads spinning on very few cores, 
which is really, really bad.

Can you either:
- use a park-based strategy for the queues (i.e. WAITING for Conversant and the 
default PARK for JCTools) if this kind of taskset setup matches your main use 
cases: I cannot imagine an application with 10 times more active threads than 
available cores wanting to use a spin- or yield-based strategy 
- not constrain the JVM, letting it use all 24 cores available on your box
- constrain the JVM to a whole NUMA node (using numactl -N 1 -m 1 java -jar 
) but run no more than 11 spinning producer threads.

Moreover, in such heavily contended scenarios there is usually a throughput vs 
latency tradeoff (if some threads are frozen, contention is released and 
overall throughput increases). I'm pretty sure that running the same benchmark 
with JMH "sample time" mode would show that the JCTools queue's latency is 
lower and more stable than the Conversant one's. That's why, in the 2-core 
scenario, the JCTools queue performs worse than ABQ when the thread count 
increases: this kind of thing cannot occur with realistic loads (i.e. with no 
more spinning threads than available cores, or with a park-based waiting 
strategy).

Kind regards,

> Add optional support for Conversant DisruptorBlockingQueue in AsyncAppender
> ---
>
> Key: LOG4J2-1430
> URL: https://issues.apache.org/jira/browse/LOG4J2-1430
> Project: Log4j 2
>  Issue Type: New Feature
>  Components: Appenders
>Affects Versions: 2.6.1
>Reporter: Matt Sicker
>Assignee: Matt Sicker
> Fix For: 2.7
>
> Attachments: AsyncAppenderPerf01.txt, AsyncLogBenchmarks.log, 
> conversantvsjctoolsnumthreads.jpg, jctools-vs-conversant-service-time.png, 
> log4j2-1430-jctools-tmp-patch.txt, log4jHaswell2cpu2core.jpg, 
> log4jHaswell2cpu4core.jpg, log4jrafile.log, log4jthread2cpu2core.log, 
> log4jthread2cpu4core.log
>
>
> [Conversant Disruptor|https://github.com/conversant/disruptor] works as an 
> implementation of BlockingQueue that is much faster than ArrayBlockingQueue. 
> I did some benchmarks earlier and found it to be a bit faster:
> h3. AsyncAppender/ArrayBlockingQueue
> {code}
> Benchmark   Mode  Samples 
>Score   Error  Units
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput10Paramsthrpt   20 
>  1101267.173 ± 17583.204  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput11Paramsthrpt   20 
>  1128269.255 ± 12188.910  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput1Param  thrpt   20 
>  1525470.805 ± 56515.933  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput2Params thrpt   20 
>  1789434.196 ± 42733.475  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput3Params thrpt   20 
>  1803276.278 ± 34938.176  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput4Params thrpt   20 
>  1468550.776 ± 26402.286  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput5Params thrpt   20 
>  1322304.349 ± 22417.997  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput6Params thrpt   20 
>  1179756.489 ± 16502.276  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput7Params thrpt   20 
>  1324660.677 ± 18893.944  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput8Params thrpt   20 
>  1309365.962 ± 19602.489  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput9Params thrpt   20 
>  1422144.180 ± 20815.042  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughputSimple  thrpt   20 
>  1247862.372 ± 18300.764  ops/s
> {code}
> h3. AsyncAppender/DisruptorBlockingQueue
> {code}
> Benchmark   Mode  Samples 
>ScoreError  Units
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput10Paramsthrpt   20 
>  3704735.586 ±  59766.253  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput11Paramsthrpt   20 
>  3622175.410 ±  31975.353  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput1Param  thrpt   20 
>  6862480.428 ± 121473.276  ops/s
> 

[jira] [Commented] (LOG4J2-1430) Add optional support for Conversant DisruptorBlockingQueue in AsyncAppender

2016-07-14 Thread Anthony Maire (JIRA)

[ 
https://issues.apache.org/jira/browse/LOG4J2-1430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15376524#comment-15376524
 ] 

Anthony Maire commented on LOG4J2-1430:
---

LinkedBlockingQueue should be considered, I think: it has better concurrency 
properties than ABQ, yet it can still be bounded. It is basically a good 
tradeoff between ABQ and LTQ.
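The tradeoff mentioned above can be shown in a short sketch: a bounded LinkedBlockingQueue uses separate put and take locks (reducing producer/consumer contention relative to ArrayBlockingQueue's single lock), yet offer() still rejects once the bound is reached:

```java
import java.util.concurrent.LinkedBlockingQueue;

public class BoundedLbqDemo {
    public static void main(String[] args) {
        // Bounded to 2 elements: the queue stays memory-safe like ABQ,
        // while its two-lock design lets puts and takes proceed in parallel.
        LinkedBlockingQueue<String> q = new LinkedBlockingQueue<>(2);
        System.out.println(q.offer("a")); // true
        System.out.println(q.offer("b")); // true
        System.out.println(q.offer("c")); // false: bound reached, offer rejected
    }
}
```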

> Add optional support for Conversant DisruptorBlockingQueue in AsyncAppender
> ---
>
> Key: LOG4J2-1430
> URL: https://issues.apache.org/jira/browse/LOG4J2-1430
> Project: Log4j 2
>  Issue Type: New Feature
>  Components: Appenders
>Affects Versions: 2.6.1
>Reporter: Matt Sicker
>Assignee: Matt Sicker
> Fix For: 2.7
>
> Attachments: AsyncAppenderPerf01.txt, 
> conversantvsjctoolsnumthreads.jpg, jctools-vs-conversant-service-time.png, 
> log4j2-1430-jctools-tmp-patch.txt
>
>
> [Conversant Disruptor|https://github.com/conversant/disruptor] works as an 
> implementation of BlockingQueue that is much faster than ArrayBlockingQueue. 
> I did some benchmarks earlier and found it to be a bit faster:
> h3. AsyncAppender/ArrayBlockingQueue
> {code}
> Benchmark   Mode  Samples 
>Score   Error  Units
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput10Paramsthrpt   20 
>  1101267.173 ± 17583.204  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput11Paramsthrpt   20 
>  1128269.255 ± 12188.910  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput1Param  thrpt   20 
>  1525470.805 ± 56515.933  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput2Params thrpt   20 
>  1789434.196 ± 42733.475  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput3Params thrpt   20 
>  1803276.278 ± 34938.176  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput4Params thrpt   20 
>  1468550.776 ± 26402.286  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput5Params thrpt   20 
>  1322304.349 ± 22417.997  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput6Params thrpt   20 
>  1179756.489 ± 16502.276  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput7Params thrpt   20 
>  1324660.677 ± 18893.944  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput8Params thrpt   20 
>  1309365.962 ± 19602.489  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput9Params thrpt   20 
>  1422144.180 ± 20815.042  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughputSimple  thrpt   20 
>  1247862.372 ± 18300.764  ops/s
> {code}
> h3. AsyncAppender/DisruptorBlockingQueue
> {code}
> Benchmark   Mode  Samples 
>ScoreError  Units
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput10Paramsthrpt   20 
>  3704735.586 ±  59766.253  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput11Paramsthrpt   20 
>  3622175.410 ±  31975.353  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput1Param  thrpt   20 
>  6862480.428 ± 121473.276  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput2Params thrpt   20 
>  6193288.988 ±  93545.144  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput3Params thrpt   20 
>  5715621.712 ± 131878.581  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput4Params thrpt   20 
>  5745187.005 ± 213854.016  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput5Params thrpt   20 
>  5307137.396 ±  88135.709  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput6Params thrpt   20 
>  4953015.419 ±  72100.403  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput7Params thrpt   20 
>  4833836.418 ±  52919.314  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput8Params thrpt   20 
>  4353791.507 ±  79047.812  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput9Params thrpt   20 
>  4136761.624 ±  67804.253  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughputSimple  thrpt   20 
>  6719456.722 ± 187433.301  ops/s
> {code}
> h3. AsyncLogger
> {code}
> BenchmarkMode  Samples
>  ScoreError  Units
> o.a.l.l.p.j.AsyncLoggersBenchmark.throughput10Paramsthrpt   20   
> 5075883.371 ± 180465.316  ops/s
> o.a.l.l.p.j.AsyncLoggersBenchmark.throughput11Paramsthrpt   20   
> 4867362.030 ± 193909.465  ops/s
> o.a.l.l.p.j.AsyncLoggersBenchmark.throughput1Param  thrpt   20  
> 10294733.024 ± 226536.965  ops/s
> o.a.l.l.p.j.AsyncLoggersBenchmark.throughput2Params thrpt   20   
> 9021650.667 ± 351102.255  ops/s
> 

[jira] [Commented] (LOG4J2-1430) Add optional support for Conversant DisruptorBlockingQueue in AsyncAppender

2016-07-14 Thread Anthony Maire (JIRA)

[ 
https://issues.apache.org/jira/browse/LOG4J2-1430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15376500#comment-15376500
 ] 

Anthony Maire commented on LOG4J2-1430:
---

If you have enough time (I won't have access to the Haswell box I was using 
earlier for a while), can you test the maximum throughput that a real appender 
can sustain? Can you please change the NoOpAppender used in the test to a 
RandomAccessFile appender with immediateFlush=false (I think that is the 
appender that currently offers the best throughput), with 8 threads and either 
the JCTools or Conversant queue?

Basically, if all queue implementations can deliver more throughput than a real 
underlying appender can sustain, comparing the throughput of the queues does 
not make sense, and only other characteristics such as offer latency and GC 
pressure really matter.
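The bottleneck argument above reduces to a min(): the steady-state throughput of queue plus appender is capped by the slower stage, so a faster queue cannot push past the appender's drain rate. Using figures from elsewhere in this thread (~7M ops/s for the fastest queue setups vs ~900K msg/s for a rolling RandomAccessFile appender) purely as illustrative inputs:

```java
public class PipelineBound {

    /**
     * Steady-state throughput of a producer -> queue -> appender pipeline
     * is bounded by its slowest stage.
     */
    public static double effectiveThroughput(double queueOpsPerSec,
                                             double appenderOpsPerSec) {
        return Math.min(queueOpsPerSec, appenderOpsPerSec);
    }

    public static void main(String[] args) {
        // Illustrative numbers only: the appender, not the queue, is the cap.
        System.out.println(effectiveThroughput(7_000_000, 900_000)); // prints 900000.0
    }
}
```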

> Add optional support for Conversant DisruptorBlockingQueue in AsyncAppender
> ---
>
> Key: LOG4J2-1430
> URL: https://issues.apache.org/jira/browse/LOG4J2-1430
> Project: Log4j 2
>  Issue Type: New Feature
>  Components: Appenders
>Affects Versions: 2.6.1
>Reporter: Matt Sicker
>Assignee: Matt Sicker
> Fix For: 2.7
>
> Attachments: AsyncAppenderPerf01.txt, 
> conversantvsjctoolsnumthreads.jpg, jctools-vs-conversant-service-time.png, 
> log4j2-1430-jctools-tmp-patch.txt
>
>
> [Conversant Disruptor|https://github.com/conversant/disruptor] works as an 
> implementation of BlockingQueue that is much faster than ArrayBlockingQueue. 
> I did some benchmarks earlier and found it to be a bit faster:
> h3. AsyncAppender/ArrayBlockingQueue
> {code}
> Benchmark   Mode  Samples 
>Score   Error  Units
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput10Paramsthrpt   20 
>  1101267.173 ± 17583.204  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput11Paramsthrpt   20 
>  1128269.255 ± 12188.910  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput1Param  thrpt   20 
>  1525470.805 ± 56515.933  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput2Params thrpt   20 
>  1789434.196 ± 42733.475  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput3Params thrpt   20 
>  1803276.278 ± 34938.176  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput4Params thrpt   20 
>  1468550.776 ± 26402.286  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput5Params thrpt   20 
>  1322304.349 ± 22417.997  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput6Params thrpt   20 
>  1179756.489 ± 16502.276  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput7Params thrpt   20 
>  1324660.677 ± 18893.944  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput8Params thrpt   20 
>  1309365.962 ± 19602.489  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput9Params thrpt   20 
>  1422144.180 ± 20815.042  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughputSimple  thrpt   20 
>  1247862.372 ± 18300.764  ops/s
> {code}
> h3. AsyncAppender/DisruptorBlockingQueue
> {code}
> Benchmark   Mode  Samples 
>ScoreError  Units
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput10Paramsthrpt   20 
>  3704735.586 ±  59766.253  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput11Paramsthrpt   20 
>  3622175.410 ±  31975.353  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput1Param  thrpt   20 
>  6862480.428 ± 121473.276  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput2Params thrpt   20 
>  6193288.988 ±  93545.144  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput3Params thrpt   20 
>  5715621.712 ± 131878.581  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput4Params thrpt   20 
>  5745187.005 ± 213854.016  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput5Params thrpt   20 
>  5307137.396 ±  88135.709  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput6Params thrpt   20 
>  4953015.419 ±  72100.403  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput7Params thrpt   20 
>  4833836.418 ±  52919.314  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput8Params thrpt   20 
>  4353791.507 ±  79047.812  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput9Params thrpt   20 
>  4136761.624 ±  67804.253  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughputSimple  thrpt   20 
>  6719456.722 ± 187433.301  ops/s
> {code}
> h3. AsyncLogger
> {code}
> Benchmark 

[jira] [Commented] (LOG4J2-1439) Add optional support for JCTools in AsyncAppender

2016-06-30 Thread Anthony Maire (JIRA)

[ 
https://issues.apache.org/jira/browse/LOG4J2-1439?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15357521#comment-15357521
 ] 

Anthony Maire commented on LOG4J2-1439:
---

Sorry, I forgot to include the pom when generating the patch. You need version 
1.2.1.

> Add optional support for JCTools in AsyncAppender
> -
>
> Key: LOG4J2-1439
> URL: https://issues.apache.org/jira/browse/LOG4J2-1439
> Project: Log4j 2
>  Issue Type: New Feature
>  Components: Appenders
>Affects Versions: 2.6.1
>Reporter: Matt Sicker
>Assignee: Matt Sicker
> Fix For: 2.7
>
> Attachments: LOG4J2-1439.patch.txt
>
>
> Spawning from the discussion in LOG4J2-1430, this feature request is to add 
> support for using JCTools MPSC bounded lock-free queue in AsyncAppender.
> As the work for this issue stems from the work in 1430, the code is all 
> contained in the same branch.






[jira] [Updated] (LOG4J2-1439) Add optional support for JCTools in AsyncAppender

2016-06-30 Thread Anthony Maire (JIRA)

 [ 
https://issues.apache.org/jira/browse/LOG4J2-1439?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anthony Maire updated LOG4J2-1439:
--
Attachment: LOG4J2-1439.patch.txt

- Support for several wait strategies
- Always keep some empty space in the queue so the producer thread does not 
suffer from false sharing when the queue is full
- Add a queue latency test to check performance without using AsyncAppender, 
which often hides the queue's performance issues.
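The "keep some empty space" point can be sketched as a wrapper that treats the queue as full while a few slack slots remain, so the producer stops short of the slots the consumer is actively draining. This is a hedged illustration, not the attached patch's code; the class and names are hypothetical:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

/**
 * Hypothetical sketch: reject offers once the queue is within `slack`
 * elements of capacity, so when the queue runs full the producer and
 * consumer are less likely to contend on adjacent (false-shared) slots.
 */
public class SlackBoundedQueue<E> {
    private final BlockingQueue<E> delegate;
    private final int softCapacity;

    public SlackBoundedQueue(int capacity, int slack) {
        this.delegate = new ArrayBlockingQueue<>(capacity);
        this.softCapacity = capacity - slack;
    }

    public boolean offer(E e) {
        if (delegate.size() >= softCapacity) {
            return false; // treat as "full" while the slack slots remain empty
        }
        return delegate.offer(e);
    }

    public E poll() {
        return delegate.poll();
    }

    public static void main(String[] args) {
        SlackBoundedQueue<Integer> q = new SlackBoundedQueue<>(4, 2);
        System.out.println(q.offer(1)); // true
        System.out.println(q.offer(2)); // true
        System.out.println(q.offer(3)); // false: inside the slack zone
    }
}
```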

> Add optional support for JCTools in AsyncAppender
> -
>
> Key: LOG4J2-1439
> URL: https://issues.apache.org/jira/browse/LOG4J2-1439
> Project: Log4j 2
>  Issue Type: New Feature
>  Components: Appenders
>Affects Versions: 2.6.1
>Reporter: Matt Sicker
>Assignee: Matt Sicker
> Fix For: 2.7
>
> Attachments: LOG4J2-1439.patch.txt
>
>
> Spawning from the discussion in LOG4J2-1430, this feature request is to add 
> support for using JCTools MPSC bounded lock-free queue in AsyncAppender.
> As the work for this issue stems from the work in 1430, the code is all 
> contained in the same branch.






[jira] [Commented] (LOG4J2-1430) Add optional support for Conversant DisruptorBlockingQueue in AsyncAppender

2016-06-30 Thread Anthony Maire (JIRA)

[ 
https://issues.apache.org/jira/browse/LOG4J2-1430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15356953#comment-15356953
 ] 

Anthony Maire commented on LOG4J2-1430:
---

the patch has been attached to LOG4J2-1439

> Add optional support for Conversant DisruptorBlockingQueue in AsyncAppender
> ---
>
> Key: LOG4J2-1430
> URL: https://issues.apache.org/jira/browse/LOG4J2-1430
> Project: Log4j 2
>  Issue Type: New Feature
>  Components: Appenders
>Affects Versions: 2.6.1
>Reporter: Matt Sicker
>Assignee: Matt Sicker
> Fix For: 2.7
>
> Attachments: AsyncAppenderPerf01.txt, 
> jctools-vs-conversant-service-time.png, log4j2-1430-jctools-tmp-patch.txt
>
>
> [Conversant Disruptor|https://github.com/conversant/disruptor] works as an 
> implementation of BlockingQueue that is much faster than ArrayBlockingQueue. 
> I did some benchmarks earlier and found it to be a bit faster:
> h3. AsyncAppender/ArrayBlockingQueue
> {code}
> Benchmark   Mode  Samples 
>Score   Error  Units
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput10Paramsthrpt   20 
>  1101267.173 ± 17583.204  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput11Paramsthrpt   20 
>  1128269.255 ± 12188.910  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput1Param  thrpt   20 
>  1525470.805 ± 56515.933  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput2Params thrpt   20 
>  1789434.196 ± 42733.475  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput3Params thrpt   20 
>  1803276.278 ± 34938.176  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput4Params thrpt   20 
>  1468550.776 ± 26402.286  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput5Params thrpt   20 
>  1322304.349 ± 22417.997  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput6Params thrpt   20 
>  1179756.489 ± 16502.276  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput7Params thrpt   20 
>  1324660.677 ± 18893.944  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput8Params thrpt   20 
>  1309365.962 ± 19602.489  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput9Params thrpt   20 
>  1422144.180 ± 20815.042  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughputSimple  thrpt   20 
>  1247862.372 ± 18300.764  ops/s
> {code}
> h3. AsyncAppender/DisruptorBlockingQueue
> {code}
> Benchmark   Mode  Samples 
>ScoreError  Units
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput10Paramsthrpt   20 
>  3704735.586 ±  59766.253  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput11Paramsthrpt   20 
>  3622175.410 ±  31975.353  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput1Param  thrpt   20 
>  6862480.428 ± 121473.276  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput2Params thrpt   20 
>  6193288.988 ±  93545.144  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput3Params thrpt   20 
>  5715621.712 ± 131878.581  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput4Params thrpt   20 
>  5745187.005 ± 213854.016  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput5Params thrpt   20 
>  5307137.396 ±  88135.709  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput6Params thrpt   20 
>  4953015.419 ±  72100.403  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput7Params thrpt   20 
>  4833836.418 ±  52919.314  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput8Params thrpt   20 
>  4353791.507 ±  79047.812  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput9Params thrpt   20 
>  4136761.624 ±  67804.253  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughputSimple  thrpt   20 
>  6719456.722 ± 187433.301  ops/s
> {code}
> h3. AsyncLogger
> {code}
> BenchmarkMode  Samples
>  ScoreError  Units
> o.a.l.l.p.j.AsyncLoggersBenchmark.throughput10Paramsthrpt   20   
> 5075883.371 ± 180465.316  ops/s
> o.a.l.l.p.j.AsyncLoggersBenchmark.throughput11Paramsthrpt   20   
> 4867362.030 ± 193909.465  ops/s
> o.a.l.l.p.j.AsyncLoggersBenchmark.throughput1Param  thrpt   20  
> 10294733.024 ± 226536.965  ops/s
> o.a.l.l.p.j.AsyncLoggersBenchmark.throughput2Params thrpt   20   
> 9021650.667 ± 351102.255  ops/s
> o.a.l.l.p.j.AsyncLoggersBenchmark.throughput3Params thrpt   20   
> 8079337.905 ± 115824.975  ops/s
> o.a.l.l.p.j.AsyncLoggersBenchmark.throughput4Params 

[jira] [Commented] (LOG4J2-1430) Add optional support for Conversant DisruptorBlockingQueue in AsyncAppender

2016-06-27 Thread Anthony Maire (JIRA)

[ 
https://issues.apache.org/jira/browse/LOG4J2-1430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15351150#comment-15351150
 ] 

Anthony Maire commented on LOG4J2-1430:
---

I will finish the JCTools wrapper implementation (since there is a new JCTools 
release with a bug fix that I needed) and post a cleaner patch in the very 
near future

> Add optional support for Conversant DisruptorBlockingQueue in AsyncAppender
> ---
>
> Key: LOG4J2-1430
> URL: https://issues.apache.org/jira/browse/LOG4J2-1430
> Project: Log4j 2
>  Issue Type: New Feature
>  Components: Appenders
>Affects Versions: 2.6.1
>Reporter: Matt Sicker
>Assignee: Matt Sicker
> Fix For: 2.7
>
> Attachments: AsyncAppenderPerf01.txt, 
> jctools-vs-conversant-service-time.png, log4j2-1430-jctools-tmp-patch.txt
>
>

[jira] [Commented] (LOG4J2-1430) Add optional support for Conversant DisruptorBlockingQueue in AsyncAppender

2016-06-27 Thread Anthony Maire (JIRA)

[ 
https://issues.apache.org/jira/browse/LOG4J2-1430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15350921#comment-15350921
 ] 

Anthony Maire commented on LOG4J2-1430:
---

I made some additional tests on a dual-Xeon server (with 24 hyperthreads) to 
see how both queues behave under heavy contention (running lots of threads on a 
4-core machine does not show contention for this kind of queue, since the CAS 
operations are never contended by a sleeping thread):

- I first tried to run the original ResponseTimeTest and AsyncAppender 
configuration files. It was pretty hard to find differences between the two 
queues, because both perform very well when the throughput is well below the 
AsyncThread's maximum throughput, and the small differences between them are 
not significant because their amplitude is orders of magnitude below the GC 
pauses caused by the appender itself
- I then ran 10 threads at 100K msg/s while a background process was spinning 
on 18 threads to simulate contention from other processes (without tuning, 
another JVM will run GC on 18 threads on this machine). As expected, the 
Conversant queue performed a little worse than the JCTools queue, but the 
difference is not that impressive

I made additional tests on the queues by themselves, outside AsyncAppender (by 
writing custom RunXXX classes in the ResponseTimeTest package):
- 20 producer threads and 1 consumer thread, all with a spinning wait policy, 
so I expected to see 2100% CPU usage in top
- The pacer uses a yield wait strategy (service time was worse with spinning, 
since there were more threads than physical cores)
- 20-minute test (with 1 minute of warmup before)

Since the AsyncThread was able to sustain a throughput of nearly 5 million 
msg/s on this machine, I tried to run the producer threads at 250K ops/s each 
(so roughly 6 billion operations for the whole test):
- The JCTools queue was able to complete the test, with a *maximum* response 
time of 11.8 ms and a service time *99.99th percentile* of 2.8µs
- The Conversant queue failed: only 2.82 billion ops were executed, the 
*median* response time is 10.4ms and the *median* service time is 3.1µs. 
Moreover, CPU usage never reached 2100% but was often between 1600% and 1800%, 
because of the LockSupport.parkNanos() call in Condition.progressiveYield() in 
its offer method when producers contend with each other (this queue is not 
lock-free, although the contention impact is much lighter than in a lock-based 
implementation such as ABQ)

I made another test at 50K ops/s per thread (1M ops/s total, well below the 
AsyncThread's maximum throughput). Both queues are able to complete the test, 
but the Conversant queue's service time is worse than the JCTools one's (which 
is nearly the same across the two tests). Look at the vertical line in the 
attached screenshot for the Conversant curve (blue): this 50µs gap matches the 
minimal pause achievable with LockSupport.parkNanos() on this machine. It shows 
that the same producer-contention issue that prevented this queue from 
completing the 5M ops/s test is at work here too.

=> The JCTools queue provides more predictable latencies and a higher maximum 
throughput because of its lock-free nature

By the way, I wonder whether it really makes sense to provide support for 
specific alternative queues in the async appender, since such support 
implicitly means "we tested this queue, it rocks hard". Being able to inject a 
custom factory is maybe enough. Currently the ABQ is probably sufficient for 
99% of the users of asynchronous logging, and "performance hardcore users" 
should use AsyncLogger, since it is garbage-free and more efficient (and using 
the original LMAX Disruptor offers possibilities that will never be possible 
with a queue, such as LOG4J2-1424) ...
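The parkNanos floor behind the 50µs gap above is easy to measure directly. A hedged sketch (the class name is illustrative; the actual pause is OS- and machine-dependent, typically tens of microseconds on Linux):

```java
import java.util.concurrent.locks.LockSupport;

public class ParkGranularity {
    // Average real pause of LockSupport.parkNanos(1): this is the floor under
    // the wake-up latency of any park-based wait strategy, regardless of the
    // 1 ns actually requested.
    static long averageParkNanos(int iterations) {
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            LockSupport.parkNanos(1L);
        }
        return (System.nanoTime() - start) / iterations;
    }

    public static void main(String[] args) {
        System.out.println("avg parkNanos(1) pause: " + averageParkNanos(1_000) + " ns");
    }
}
```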

> Add optional support for Conversant DisruptorBlockingQueue in AsyncAppender
> ---
>
> Key: LOG4J2-1430
> URL: https://issues.apache.org/jira/browse/LOG4J2-1430
> Project: Log4j 2
>  Issue Type: New Feature
>  Components: Appenders
>Affects Versions: 2.6.1
>Reporter: Matt Sicker
>Assignee: Matt Sicker
> Fix For: 2.7
>
> Attachments: AsyncAppenderPerf01.txt, 
> jctools-vs-conversant-service-time.png, log4j2-1430-jctools-tmp-patch.txt
>
>

[jira] [Updated] (LOG4J2-1430) Add optional support for Conversant DisruptorBlockingQueue in AsyncAppender

2016-06-27 Thread Anthony Maire (JIRA)

 [ 
https://issues.apache.org/jira/browse/LOG4J2-1430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anthony Maire updated LOG4J2-1430:
--
Attachment: jctools-vs-conversant-service-time.png

> Add optional support for Conversant DisruptorBlockingQueue in AsyncAppender
> ---
>
> Key: LOG4J2-1430
> URL: https://issues.apache.org/jira/browse/LOG4J2-1430
> Project: Log4j 2
>  Issue Type: New Feature
>  Components: Appenders
>Affects Versions: 2.6.1
>Reporter: Matt Sicker
>Assignee: Matt Sicker
> Fix For: 2.7
>
> Attachments: AsyncAppenderPerf01.txt, 
> jctools-vs-conversant-service-time.png, log4j2-1430-jctools-tmp-patch.txt
>
>

[jira] [Commented] (LOG4J2-1430) Add optional support for Conversant DisruptorBlockingQueue in AsyncAppender

2016-06-23 Thread Anthony Maire (JIRA)

[ 
https://issues.apache.org/jira/browse/LOG4J2-1430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15346328#comment-15346328
 ] 

Anthony Maire commented on LOG4J2-1430:
---

I had a look at the ResponseTimeTest; it's exactly what I was thinking of for a 
good latency benchmark. I will try to find some time to play with it


> Add optional support for Conversant DisruptorBlockingQueue in AsyncAppender
> ---
>
> Key: LOG4J2-1430
> URL: https://issues.apache.org/jira/browse/LOG4J2-1430
> Project: Log4j 2
>  Issue Type: New Feature
>  Components: Appenders
>Affects Versions: 2.6.1
>Reporter: Matt Sicker
>Assignee: Matt Sicker
> Fix For: 2.7
>
> Attachments: AsyncAppenderPerf01.txt, 
> log4j2-1430-jctools-tmp-patch.txt
>
>

[jira] [Commented] (LOG4J2-1430) Add optional support for Conversant DisruptorBlockingQueue in AsyncAppender

2016-06-22 Thread Anthony Maire (JIRA)

[ 
https://issues.apache.org/jira/browse/LOG4J2-1430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15345211#comment-15345211
 ] 

Anthony Maire commented on LOG4J2-1430:
---

The approach of using a NoOp appender for testing AsyncAppender isn't a bad 
idea at all. The "issue" is that it's easy to misinterpret the results.

The problem is that we naturally expect the queue to be the bottleneck (and it 
indeed is in a lot of ABQ tests), but with these high-performance queues, the 
single-threaded code between queue.take() and appender.append() is the 
bottleneck. With 8 producer threads, the consumer thread was 100% busy but the 
producers were not. So we are measuring something that doesn't reflect the real 
world (in a real application we don't expect the queue to always be full), and 
this benchmark is not good at showing which queue implementation is best. 

I'm probably biased because I work for a low-latency company, but the best 
metric for evaluating the "performance" of the AsyncAppender is latency, and 
latency is really damn hard to measure (cf. this excellent talk by Gil Tene on 
the subject: https://www.infoq.com/presentations/latency-response-time ). For 
such cases, I usually measure latency percentiles at a load below the maximum 
throughput (since all systems suck from a latency point of view when running 
near their saturation point), recording all individual results in an 
HdrHistogram. I don't think that can be done with JMH; I usually write a custom 
load generator for this
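The core of such a load generator can be sketched as below. HdrHistogram would normally do the recording; a plain array stands in here so the sketch is self-contained, and the class and method names are illustrative. The key point is measuring each response time from the operation's *scheduled* start, not from when it actually began, so a stall inflates the latencies of the operations queued behind it (avoiding coordinated omission):

```java
import java.util.Arrays;

public class PacedLoadGenerator {
    // Run `ops` operations, one every `intervalNanos`, measuring response time
    // from the intended (scheduled) start of each operation.
    static long[] run(int ops, long intervalNanos, Runnable op) {
        long[] responseTimes = new long[ops];
        long base = System.nanoTime();
        for (int i = 0; i < ops; i++) {
            long intended = base + i * intervalNanos;
            while (System.nanoTime() < intended) { /* pacer: spin until schedule */ }
            op.run();
            responseTimes[i] = System.nanoTime() - intended; // schedule-relative
        }
        return responseTimes;
    }

    // Crude percentile over the recorded samples (HdrHistogram does this better).
    static long percentile(long[] samples, double p) {
        long[] sorted = samples.clone();
        Arrays.sort(sorted);
        return sorted[Math.min(sorted.length - 1, (int) (p * sorted.length))];
    }

    public static void main(String[] args) {
        long[] rt = run(10_000, 1_000L, () -> { }); // 1M ops/s against a no-op
        System.out.println("high percentile: " + percentile(rt, 0.9999) + " ns");
    }
}
```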

A major caveat (among many others) with measuring latency is that the 
production environment may be very different from the idealized environment you 
have when benchmarking. In particular, the system might be overloaded (i.e. 
there are more active threads on the whole machine, including kernel threads, 
than available cores) and/or virtualized. In that situation the kind of 
concurrency (wait-free, lock-free, ...) matters a lot, since threads can be 
suspended for several milliseconds

Showing this effect in a benchmark is hard, because it occurs at a very low 
frequency under "production-like" load and there are other causes of 
low-frequency outliers; it may require a lot of analysis. The usual way to show 
that an algorithm has poorer concurrency properties is to increase the 
throughput and/or the number of threads of the load generator until some 
candidates behave poorly: lock-free algorithms scale better than blocking ones.

But with the current AsyncAppender code, this kind of test is not possible, 
because the async thread isn't performant enough. To compare queues, the best 
approach is to compare them outside AsyncAppender. Otherwise we would need to 
heavily optimise the AsyncAppender itself to show differences between queues, 
which would probably be a useless effort since the current code is likely 
efficient enough with a real underlying appender. 

> Add optional support for Conversant DisruptorBlockingQueue in AsyncAppender
> ---
>
> Key: LOG4J2-1430
> URL: https://issues.apache.org/jira/browse/LOG4J2-1430
> Project: Log4j 2
>  Issue Type: New Feature
>  Components: Appenders
>Affects Versions: 2.6.1
>Reporter: Matt Sicker
>Assignee: Matt Sicker
> Fix For: 2.7
>
> Attachments: AsyncAppenderPerf01.txt, 
> log4j2-1430-jctools-tmp-patch.txt
>
>

[jira] [Commented] (LOG4J2-1430) Add optional support for Conversant DisruptorBlockingQueue in AsyncAppender

2016-06-22 Thread Anthony Maire (JIRA)

[ 
https://issues.apache.org/jira/browse/LOG4J2-1430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15344294#comment-15344294
 ] 

Anthony Maire commented on LOG4J2-1430:
---

I made another test with the JCTools queue and the "yield then park" wait 
strategy that caused the throughput to drop earlier. I changed the 
implementation of the put() method to use offerIfBelowThreshold() instead of 
offer() (this requires a snapshot version of JCTools: the method is broken in 
1.2, and the fix was merged earlier today). This way, at least 32 slots in the 
underlying buffer stay free, so producer threads no longer cause cache misses 
for the consumer thread.

Now the throughput is roughly the same as with the "pure park" wait strategy. 

That confirms the feeling I described in the previous post: *for any good 
enough (i.e. non-blocking) implementation of the queue, the AsyncAppender 
throughput performance test only measures how badly the consumer thread is 
impacted by the wait strategy when the queue is full*
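A runnable sketch of that put() idea. The real change uses JCTools' MpscArrayQueue.offerIfBelowThreshold(); here a plain ArrayBlockingQueue stands in so the sketch is self-contained, and the class name and park fallback are illustrative:

```java
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.locks.LockSupport;

public class SlackPut<E> {
    private final ArrayBlockingQueue<E> queue;
    private static final int FREE_SLACK = 32; // slots deliberately left empty

    public SlackPut(int capacity) {
        this.queue = new ArrayBlockingQueue<>(capacity);
    }

    public void put(E e) {
        // Back off while fewer than FREE_SLACK slots are free, so producers
        // never write into the cache line the consumer is about to read.
        while (queue.remainingCapacity() <= FREE_SLACK || !queue.offer(e)) {
            LockSupport.parkNanos(1L);
        }
    }

    public E poll() {
        return queue.poll();
    }
}
```

The check-then-offer pair is racy between producers, but a lost race only means one extra park, which is harmless for the purpose of the sketch.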

> Add optional support for Conversant DisruptorBlockingQueue in AsyncAppender
> ---
>
> Key: LOG4J2-1430
> URL: https://issues.apache.org/jira/browse/LOG4J2-1430
> Project: Log4j 2
>  Issue Type: New Feature
>  Components: Appenders
>Affects Versions: 2.6.1
>Reporter: Matt Sicker
>Assignee: Matt Sicker
> Fix For: 2.7
>
> Attachments: AsyncAppenderPerf01.txt, 
> log4j2-1430-jctools-tmp-patch.txt
>
>

[jira] [Comment Edited] (LOG4J2-1430) Add optional support for Conversant DisruptorBlockingQueue in AsyncAppender

2016-06-20 Thread Anthony Maire (JIRA)

[ 
https://issues.apache.org/jira/browse/LOG4J2-1430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15340369#comment-15340369
 ] 

Anthony Maire edited comment on LOG4J2-1430 at 6/20/16 8:52 PM:


I made further tests today, and while I still believe that the Conversant 
Disruptor is not the best alternative because of the coupling between the 
slot claiming order and the data publishing order, the main issue is not there.

These throughput benchmarks only measure one thing: how much the consumer 
thread is impacted by the wait strategy. And as soon as the strategy does not 
use condition variables, the longer the producer threads are paused, the better 
the throughput will be.

As I said in a previous comment, even with 24 or 36 threads, only 12 
hyperthreads were busy with the JCTools queue with a LockSupport.parkNanos wait 
strategy. So most of the time, the queue is full. I tried to play with the 
implementation of the wait strategy in my JCTools blocking wrapper to use 
something that is more like the one used in LMAX or Conversant disruptor with 
some spin/yield before falling back to park.
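A progressive wait strategy of the kind described here (spin first, then yield, then fall back to park) can be sketched as follows; the class name and thresholds are illustrative, not the actual LMAX or Conversant values:

```java
import java.util.concurrent.LockSupport;
import java.util.function.BooleanSupplier;

// Illustrative spin -> yield -> park back-off loop; the threshold values
// below are made up, real wait strategies tune them per platform.
public final class ProgressiveWait {
    private static final int SPIN_TRIES = 100;
    private static final int YIELD_TRIES = 100;

    public static void awaitTrue(BooleanSupplier condition) {
        int counter = 0;
        while (!condition.getAsBoolean()) {
            if (counter < SPIN_TRIES) {
                // busy-spin: cheapest wake-up, but burns a core
            } else if (counter < SPIN_TRIES + YIELD_TRIES) {
                Thread.yield();            // give up the time slice
            } else {
                LockSupport.parkNanos(1L); // pause granularity is OS-dependent
            }
            counter++;
        }
    }
}
```

A producer blocked on a full queue would call `awaitTrue(() -> !queue.isFull())`; the longer the loop stays in the park branch, the less cache traffic it generates near the consumer.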

The result is not what I expected: the yield loop made the JCTools queue 
throughput lower! I think it can be explained by false sharing: when the queue 
is nearly full, the producers write to slots that are adjacent to the one being 
read by the consumer. Every time an event is inserted into the queue, that 
cache line is evicted from the consumer core's L1 cache, causing a cache miss. 
If producers are paused for a longer time, then the consumer can append events 
without these cache misses for longer, and the throughput is better.

That's why the Conversant Disruptor was performing way better under Windows 
than under Linux: the pause time of parkNanos is 20 times longer on Windows, 
and that was enough to avoid the negative impact of the spin/yield loop in the 
wait strategy.

The main question with these benchmarks is: what do we want to measure? How can 
we say that one implementation is more "performant" than another? The 
throughput with a no-op appender is maybe not the best metric: in real 
applications there will be real filters and appenders that will not sustain 
such a throughput, so we are measuring something that does not exist in the 
real world.

In my opinion, there are two use cases where an async appender is useful:
- when we need to increase the maximal throughput of the underlying appender 
thanks to batching (cf. RandomAccessFile appender with immediateFlush=false). 
The queue implementation has little impact on this
- when we need to keep latency low in the application thread under a realistic 
load (i.e. we don't expect the queue to be nearly full all the time; if it is, 
then another part of the system needs improvement)

As Remko said, Log4j is used in a lot of situations that we cannot control. And 
saying something like "you shouldn't have more active threads than cores in 
your app" is not a solution; if the server is used for other apps it will not 
be enough. Being lock-free on the producer side (as long as the queue is not 
full) is a mandatory characteristic in my opinion.

The intrinsic latency of the queue is often not the main characteristic: there 
are lots of things done in the main thread before enqueuing, especially with 
parameterized messages. The call to the logger can take a few microseconds, 
whereas any reasonable high-performance collection will enqueue in something 
like 100 ns under low contention (and the underlying appender will usually 
saturate well before the queue sees high contention).


was (Author: anthony maire):
I made further tests today, and while I still believe that the Conversant 
Disruptor is not the best alternative because of the coupling between the 
slot claiming order and the data publishing order, the main issue is not there.

These throughput benchmarks only mesure one thing: how much is the consumer 
thread impacted by the wait strategy. And as soon as the strategy does not use 
condition variables, the more the consumer threads are paused, the better the 
throughput will be

As I said in a previous comment, even with 24 or 36 threads, only 12 
hyperthreads were busy with the JCTools queue with a LockSupport.parkNanos wait 
strategy. So most of the time, the queue is full. I tried to play with the 
implementation of the wait strategy in my JCTools blocking wrapper to use 
something that is more like the one used in LMAX or Conversant disruptor with 
some spin/yield before falling back to park.

The result is not what I expected: the yield loop made the JCTools queue 
throughput lower! I think it can be explained by false sharing: when the queue 
is nearly full, the producers will write to slots that are adjacent to the one 
that is read by the consumer. Every time an event is inserted in the queue, 
that cache line is evicted 

[jira] [Commented] (LOG4J2-1430) Add optional support for Conversant DisruptorBlockingQueue in AsyncAppender

2016-06-20 Thread Anthony Maire (JIRA)

[ 
https://issues.apache.org/jira/browse/LOG4J2-1430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15340369#comment-15340369
 ] 

Anthony Maire commented on LOG4J2-1430:
---

I made further tests today, and while I still believe that the Conversant 
Disruptor is not the best alternative because of the coupling between the 
slot claiming order and the data publishing order, the main issue is not there.

These throughput benchmarks only measure one thing: how much the consumer 
thread is impacted by the wait strategy. And as soon as the strategy does not 
use condition variables, the longer the producer threads are paused, the better 
the throughput will be.

As I said in a previous comment, even with 24 or 36 threads, only 12 
hyperthreads were busy with the JCTools queue with a LockSupport.parkNanos wait 
strategy. So most of the time, the queue is full. I tried to play with the 
implementation of the wait strategy in my JCTools blocking wrapper to use 
something that is more like the one used in LMAX or Conversant disruptor with 
some spin/yield before falling back to park.

The result is not what I expected: the yield loop made the JCTools queue 
throughput lower! I think it can be explained by false sharing: when the queue 
is nearly full, the producers write to slots that are adjacent to the one being 
read by the consumer. Every time an event is inserted into the queue, that 
cache line is evicted from the consumer core's L1 cache, causing a cache miss. 
If producers are paused for a longer time, then the consumer can append events 
without these cache misses for longer, and the throughput is better.

That's why the Conversant Disruptor was performing way better under Windows 
than under Linux: the pause time of parkNanos is 20 times longer on Windows, 
and that was enough to avoid the negative impact of the spin/yield loop in the 
wait strategy.

The main question with these benchmarks is: what do we want to measure? How can 
we say that one implementation is more "performant" than another? The 
throughput with a no-op appender is maybe not the best metric: in real 
applications there will be real filters and appenders that will not sustain 
such a throughput, so we are measuring something that does not exist in the 
real world.

In my opinion, there are two use cases where an async appender is useful:
- when we need to increase the maximal throughput of the underlying appender 
thanks to batching (cf. RandomAccessFile appender with immediateFlush=false). 
The queue implementation has little impact on this
- when we need to keep latency low in the application thread under a realistic 
load (i.e. we don't expect the queue to be nearly full all the time; if it is, 
then another part of the system needs improvement)

As Remko said, Log4j is used in a lot of situations that we cannot control. And 
saying something like "you shouldn't have more active threads than cores in 
your app" is not a solution; if the server is used for other apps it will not 
be enough. Being lock-free on the producer side (as long as the queue is not 
full) is a mandatory characteristic.

The intrinsic latency of the queue is often not the main characteristic: there 
are lots of things done in the main thread before enqueuing, especially with 
parameterized messages. The call to the logger can take a few microseconds, 
whereas any reasonable high-performance collection will enqueue in something 
like 100 ns under low contention (and the underlying appender will usually 
saturate well before the queue sees high contention).

> Add optional support for Conversant DisruptorBlockingQueue in AsyncAppender
> ---
>
> Key: LOG4J2-1430
> URL: https://issues.apache.org/jira/browse/LOG4J2-1430
> Project: Log4j 2
>  Issue Type: New Feature
>  Components: Appenders
>Affects Versions: 2.6.1
>Reporter: Matt Sicker
>Assignee: Matt Sicker
> Fix For: 2.7
>
> Attachments: AsyncAppenderPerf01.txt, 
> log4j2-1430-jctools-tmp-patch.txt
>
>
> [Conversant Disruptor|https://github.com/conversant/disruptor] works as an 
> implementation of BlockingQueue that is much faster than ArrayBlockingQueue. 
> I did some benchmarks earlier and found it to be a bit faster:
> h3. AsyncAppender/ArrayBlockingQueue
> {code}
> Benchmark   Mode  Samples 
>Score   Error  Units
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput10Paramsthrpt   20 
>  1101267.173 ± 17583.204  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput11Paramsthrpt   20 
>  1128269.255 ± 12188.910  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput1Param  thrpt   20 
>  1525470.805 ± 56515.933  ops/s
> 

[jira] [Updated] (LOG4J2-1441) Asynchronous logging + immediateFlush=false appenders : potential data retention on end of batch event

2016-06-20 Thread Anthony Maire (JIRA)

 [ 
https://issues.apache.org/jira/browse/LOG4J2-1441?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anthony Maire updated LOG4J2-1441:
--
Attachment: example_LOG4J2-1441.zip
flush_fail_1441.zip

> Asynchronous logging + immediateFlush=false appenders : potential data 
> retention on end of batch event
> --
>
> Key: LOG4J2-1441
> URL: https://issues.apache.org/jira/browse/LOG4J2-1441
> Project: Log4j 2
>  Issue Type: Bug
>Affects Versions: 2.6
>Reporter: Anthony Maire
> Attachments: example_LOG4J2-1441.zip, flush_fail_1441.zip
>
>
> When using asynchronous logging, it is suggested to use 
> immediateFlush=false for file-based appenders and to rely on the end-of-batch 
> detection to perform the flush, which is supposed to happen in a timely manner.
> As per AsyncLogger JavaDoc
> {quote}
> For best performance, use AsyncLogger with the RandomAccessFileAppender or 
> RollingRandomAccessFileAppender, with immediateFlush=false
> {quote}
> As per File/RandomAccessFile appenders doc on the site
> {quote}
> Flushing after every write is only useful when using this appender with 
> synchronous loggers. Asynchronous loggers and appenders will automatically 
> flush at the end of a batch of events, even if immediateFlush is set to 
> false. This also guarantees the data is written to disk but is more efficient.
> {quote}
> However, if there are multiple appenders behind the same asynchronous 
> mechanism (either multiple appenderRefs for an AsyncAppender, or multiple 
> AsyncLoggers, since they share the same disruptor instance), the last event 
> of the batch will only flush the appenders that actually log it, and data in 
> the other appenders won't be flushed.
> I made a small example using two asynchronous loggers; the same kind of issue 
> should occur with several appenders linked to the same AsyncAppender when the 
> last event is filtered out because of its level or a custom filter.
> - 10 seconds after the start, all events are processed but the main thread 
> will not exit because of an infinite loop
> - two events were processed by the root logger and correctly logged in the 
> console appender
> - only one event has been flushed to the file (the file appender is attached 
> to the root logger too)
> - I made a heap dump of the JVM; it shows that the RandomAccessFileManager 
> internal byte buffer has pending data (byteBuffer.position=61)



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: log4j-dev-unsubscr...@logging.apache.org
For additional commands, e-mail: log4j-dev-h...@logging.apache.org



[jira] [Commented] (LOG4J2-1441) Asynchronous logging + immediateFlush=false appenders : potential data retention on end of batch event

2016-06-20 Thread Anthony Maire (JIRA)

[ 
https://issues.apache.org/jira/browse/LOG4J2-1441?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15339598#comment-15339598
 ] 

Anthony Maire commented on LOG4J2-1441:
---

I'm not sure I know the code base well enough to provide a clean solution to 
fix this, but I can try :)

> Asynchronous logging + immediateFlush=false appenders : potential data 
> retention on end of batch event
> --
>
> Key: LOG4J2-1441
> URL: https://issues.apache.org/jira/browse/LOG4J2-1441
> Project: Log4j 2
>  Issue Type: Bug
>Affects Versions: 2.6
>Reporter: Anthony Maire
> Attachments: example_LOG4J2-1441.zip, flush_fail_1441.zip
>
>
> When using asynchronous logging, it is suggested to use 
> immediateFlush=false for file-based appenders and to rely on the end-of-batch 
> detection to perform the flush, which is supposed to happen in a timely manner.
> As per AsyncLogger JavaDoc
> {quote}
> For best performance, use AsyncLogger with the RandomAccessFileAppender or 
> RollingRandomAccessFileAppender, with immediateFlush=false
> {quote}
> As per File/RandomAccessFile appenders doc on the site
> {quote}
> Flushing after every write is only useful when using this appender with 
> synchronous loggers. Asynchronous loggers and appenders will automatically 
> flush at the end of a batch of events, even if immediateFlush is set to 
> false. This also guarantees the data is written to disk but is more efficient.
> {quote}
> However, if there are multiple appenders behind the same asynchronous 
> mechanism (either multiple appenderRefs for an AsyncAppender, or multiple 
> AsyncLoggers, since they share the same disruptor instance), the last event 
> of the batch will only flush the appenders that actually log it, and data in 
> the other appenders won't be flushed.
> I made a small example using two asynchronous loggers; the same kind of issue 
> should occur with several appenders linked to the same AsyncAppender when the 
> last event is filtered out because of its level or a custom filter.
> - 10 seconds after the start, all events are processed but the main thread 
> will not exit because of an infinite loop
> - two events were processed by the root logger and correctly logged in the 
> console appender
> - only one event has been flushed to the file (the file appender is attached 
> to the root logger too)
> - I made a heap dump of the JVM; it shows that the RandomAccessFileManager 
> internal byte buffer has pending data (byteBuffer.position=61)






[jira] [Created] (LOG4J2-1441) Asynchronous logging + immediateFlush=false appenders : potential data retention on end of batch event

2016-06-20 Thread Anthony Maire (JIRA)
Anthony Maire created LOG4J2-1441:
-

 Summary: Asynchronous logging + immediateFlush=false appenders : 
potential data retention on end of batch event
 Key: LOG4J2-1441
 URL: https://issues.apache.org/jira/browse/LOG4J2-1441
 Project: Log4j 2
  Issue Type: Bug
Affects Versions: 2.6
Reporter: Anthony Maire


When using asynchronous logging, it is suggested to use immediateFlush=false 
for file-based appenders and to rely on the end-of-batch detection to perform 
the flush, which is supposed to happen in a timely manner.

As per AsyncLogger JavaDoc
{quote}
For best performance, use AsyncLogger with the RandomAccessFileAppender or 
RollingRandomAccessFileAppender, with immediateFlush=false
{quote}

As per File/RandomAccessFile appenders doc on the site
{quote}
Flushing after every write is only useful when using this appender with 
synchronous loggers. Asynchronous loggers and appenders will automatically 
flush at the end of a batch of events, even if immediateFlush is set to false. 
This also guarantees the data is written to disk but is more efficient.
{quote}

However, if there are multiple appenders behind the same asynchronous mechanism 
(either multiple appenderRefs for an AsyncAppender, or multiple AsyncLoggers, 
since they share the same disruptor instance), the last event of the batch will 
only flush the appenders that actually log it, and data in the other appenders 
won't be flushed.

I made a small example using two asynchronous loggers; the same kind of issue 
should occur with several appenders linked to the same AsyncAppender when the 
last event is filtered out because of its level or a custom filter.

- 10 seconds after the start, all events are processed but the main thread 
will not exit because of an infinite loop
- two events were processed by the root logger and correctly logged in the 
console appender
- only one event has been flushed to the file (the file appender is attached 
to the root logger too)
- I made a heap dump of the JVM; it shows that the RandomAccessFileManager 
internal byte buffer has pending data (byteBuffer.position=61)
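The retention can be reproduced without Log4j at all: if only the sink that received the last event of the batch is flushed, bytes buffered in the other sink stay in memory. A minimal sketch (class and method names are mine; plain BufferedOutputStreams stand in for the two appender managers):

```java
import java.io.BufferedOutputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;

// Two buffered "appenders"; flushing only the one that handled the
// end-of-batch event leaves the other one's bytes stuck in its buffer.
public final class FlushDemo {
    public static int[] pendingAfterPartialFlush() throws IOException {
        ByteArrayOutputStream diskA = new ByteArrayOutputStream();
        ByteArrayOutputStream diskB = new ByteArrayOutputStream();
        BufferedOutputStream appenderA = new BufferedOutputStream(diskA, 8192);
        BufferedOutputStream appenderB = new BufferedOutputStream(diskB, 8192);

        appenderA.write("event for A\n".getBytes());
        appenderB.write("event for B\n".getBytes());

        appenderA.flush(); // only A saw the last event of the batch

        // A's bytes reached the "disk"; B's are still only in its buffer.
        return new int[] { diskA.size(), diskB.size() };
    }
}
```

Here `diskB` stays empty until something else flushes `appenderB`, which is exactly the retained `byteBuffer.position=61` observed in the heap dump.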








[jira] [Commented] (LOG4J2-1430) Add optional support for Conversant DisruptorBlockingQueue in AsyncAppender

2016-06-19 Thread Anthony Maire (JIRA)

[ 
https://issues.apache.org/jira/browse/LOG4J2-1430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15338834#comment-15338834
 ] 

Anthony Maire commented on LOG4J2-1430:
---

In theory you are perfectly right, but:

- most of the JVMs currently on the market do support Unsafe, especially all 
the JVMs that are usually used in the high-performance community
- the Unsafe features that are required should become standard in JDK 9 
(http://openjdk.java.net/jeps/193), so it will be easy to make this queue 
implementation standard in the near future
- if you would like the JCTools queue to become the standard for AsyncAppender 
but the dependency on Unsafe is a blocking issue, I can write a portable 
version without Unsafe so that you can test it and make a comparison. It should 
be slightly slower than the original version, but it should keep the same kind 
of characteristics under heavy contention. Just let me know if you're 
interested in this :)

> Add optional support for Conversant DisruptorBlockingQueue in AsyncAppender
> ---
>
> Key: LOG4J2-1430
> URL: https://issues.apache.org/jira/browse/LOG4J2-1430
> Project: Log4j 2
>  Issue Type: New Feature
>  Components: Appenders
>Affects Versions: 2.6.1
>Reporter: Matt Sicker
>Assignee: Matt Sicker
> Fix For: 2.7
>
> Attachments: AsyncAppenderPerf01.txt, 
> log4j2-1430-jctools-tmp-patch.txt
>
>
> [Conversant Disruptor|https://github.com/conversant/disruptor] works as an 
> implementation of BlockingQueue that is much faster than ArrayBlockingQueue. 
> I did some benchmarks earlier and found it to be a bit faster:
> h3. AsyncAppender/ArrayBlockingQueue
> {code}
> Benchmark   Mode  Samples 
>Score   Error  Units
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput10Paramsthrpt   20 
>  1101267.173 ± 17583.204  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput11Paramsthrpt   20 
>  1128269.255 ± 12188.910  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput1Param  thrpt   20 
>  1525470.805 ± 56515.933  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput2Params thrpt   20 
>  1789434.196 ± 42733.475  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput3Params thrpt   20 
>  1803276.278 ± 34938.176  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput4Params thrpt   20 
>  1468550.776 ± 26402.286  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput5Params thrpt   20 
>  1322304.349 ± 22417.997  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput6Params thrpt   20 
>  1179756.489 ± 16502.276  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput7Params thrpt   20 
>  1324660.677 ± 18893.944  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput8Params thrpt   20 
>  1309365.962 ± 19602.489  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput9Params thrpt   20 
>  1422144.180 ± 20815.042  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughputSimple  thrpt   20 
>  1247862.372 ± 18300.764  ops/s
> {code}
> h3. AsyncAppender/DisruptorBlockingQueue
> {code}
> Benchmark   Mode  Samples 
>ScoreError  Units
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput10Paramsthrpt   20 
>  3704735.586 ±  59766.253  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput11Paramsthrpt   20 
>  3622175.410 ±  31975.353  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput1Param  thrpt   20 
>  6862480.428 ± 121473.276  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput2Params thrpt   20 
>  6193288.988 ±  93545.144  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput3Params thrpt   20 
>  5715621.712 ± 131878.581  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput4Params thrpt   20 
>  5745187.005 ± 213854.016  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput5Params thrpt   20 
>  5307137.396 ±  88135.709  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput6Params thrpt   20 
>  4953015.419 ±  72100.403  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput7Params thrpt   20 
>  4833836.418 ±  52919.314  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput8Params thrpt   20 
>  4353791.507 ±  79047.812  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput9Params thrpt   20 
>  4136761.624 ±  67804.253  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughputSimple  thrpt   20 
>  6719456.722 ± 187433.301  ops/s
> {code}
> h3. AsyncLogger
> {code}
> Benchmark  

[jira] [Updated] (LOG4J2-1430) Add optional support for Conversant DisruptorBlockingQueue in AsyncAppender

2016-06-19 Thread Anthony Maire (JIRA)

 [ 
https://issues.apache.org/jira/browse/LOG4J2-1430?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anthony Maire updated LOG4J2-1430:
--
Attachment: log4j2-1430-jctools-tmp-patch.txt

> Add optional support for Conversant DisruptorBlockingQueue in AsyncAppender
> ---
>
> Key: LOG4J2-1430
> URL: https://issues.apache.org/jira/browse/LOG4J2-1430
> Project: Log4j 2
>  Issue Type: New Feature
>  Components: Appenders
>Affects Versions: 2.6.1
>Reporter: Matt Sicker
>Assignee: Matt Sicker
> Fix For: 2.7
>
> Attachments: AsyncAppenderPerf01.txt, 
> log4j2-1430-jctools-tmp-patch.txt
>
>
> [Conversant Disruptor|https://github.com/conversant/disruptor] works as an 
> implementation of BlockingQueue that is much faster than ArrayBlockingQueue. 
> I did some benchmarks earlier and found it to be a bit faster:
> h3. AsyncAppender/ArrayBlockingQueue
> {code}
> Benchmark   Mode  Samples 
>Score   Error  Units
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput10Paramsthrpt   20 
>  1101267.173 ± 17583.204  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput11Paramsthrpt   20 
>  1128269.255 ± 12188.910  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput1Param  thrpt   20 
>  1525470.805 ± 56515.933  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput2Params thrpt   20 
>  1789434.196 ± 42733.475  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput3Params thrpt   20 
>  1803276.278 ± 34938.176  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput4Params thrpt   20 
>  1468550.776 ± 26402.286  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput5Params thrpt   20 
>  1322304.349 ± 22417.997  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput6Params thrpt   20 
>  1179756.489 ± 16502.276  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput7Params thrpt   20 
>  1324660.677 ± 18893.944  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput8Params thrpt   20 
>  1309365.962 ± 19602.489  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput9Params thrpt   20 
>  1422144.180 ± 20815.042  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughputSimple  thrpt   20 
>  1247862.372 ± 18300.764  ops/s
> {code}
> h3. AsyncAppender/DisruptorBlockingQueue
> {code}
> Benchmark   Mode  Samples 
>ScoreError  Units
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput10Paramsthrpt   20 
>  3704735.586 ±  59766.253  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput11Paramsthrpt   20 
>  3622175.410 ±  31975.353  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput1Param  thrpt   20 
>  6862480.428 ± 121473.276  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput2Params thrpt   20 
>  6193288.988 ±  93545.144  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput3Params thrpt   20 
>  5715621.712 ± 131878.581  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput4Params thrpt   20 
>  5745187.005 ± 213854.016  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput5Params thrpt   20 
>  5307137.396 ±  88135.709  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput6Params thrpt   20 
>  4953015.419 ±  72100.403  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput7Params thrpt   20 
>  4833836.418 ±  52919.314  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput8Params thrpt   20 
>  4353791.507 ±  79047.812  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput9Params thrpt   20 
>  4136761.624 ±  67804.253  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughputSimple  thrpt   20 
>  6719456.722 ± 187433.301  ops/s
> {code}
> h3. AsyncLogger
> {code}
> BenchmarkMode  Samples
>  ScoreError  Units
> o.a.l.l.p.j.AsyncLoggersBenchmark.throughput10Paramsthrpt   20   
> 5075883.371 ± 180465.316  ops/s
> o.a.l.l.p.j.AsyncLoggersBenchmark.throughput11Paramsthrpt   20   
> 4867362.030 ± 193909.465  ops/s
> o.a.l.l.p.j.AsyncLoggersBenchmark.throughput1Param  thrpt   20  
> 10294733.024 ± 226536.965  ops/s
> o.a.l.l.p.j.AsyncLoggersBenchmark.throughput2Params thrpt   20   
> 9021650.667 ± 351102.255  ops/s
> o.a.l.l.p.j.AsyncLoggersBenchmark.throughput3Params thrpt   20   
> 8079337.905 ± 115824.975  ops/s
> o.a.l.l.p.j.AsyncLoggersBenchmark.throughput4Params thrpt   20   
> 7347356.788 ±  66598.738  ops/s
> 

[jira] [Commented] (LOG4J2-1430) Add optional support for Conversant DisruptorBlockingQueue in AsyncAppender

2016-06-19 Thread Anthony Maire (JIRA)

[ 
https://issues.apache.org/jira/browse/LOG4J2-1430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15338648#comment-15338648
 ] 

Anthony Maire commented on LOG4J2-1430:
---

I will attach my patch right after posting this comment. This is clearly a 
quick & dirty patch to show that this "DisruptorBlockingQueue" is not so 
performant compared to something that is way closer to the current "state of 
the art". The way the JCTools queue is converted to a BlockingQueue does matter 
(just as the choice of the wait strategy matters for the "real" disruptor). I 
removed the LinkedTransferQueue when testing since I had some OutOfMemoryErrors 
with it and didn't have time to investigate why.

The JCTools queue is free of the caveats that I spotted with the Conversant 
Disruptor. The main issue when dealing with a multi-producer scenario is that 
you have two orders to consider:
- the order in which producers claim slots
- the order in which data becomes available.

Both queues solve this issue differently:
- The Conversant Disruptor has an AtomicLong for each order, and CASes one of 
them with the value of the other => it forces threads to publish data in the 
same order the slots were claimed, which doesn't scale at all with lots of 
producers. It is not a lock-free algorithm, since a frozen thread can prevent 
the whole system from making progress
- The JCTools queue has a single "Atomic" variable (although it's a volatile 
long, Unsafe is used to add the semantics of an AtomicLong). Elements are 
stored using an ordered store, which is a pretty cheap operation, and the 
consumer sees published elements as soon as they are available. The two orders 
are independent, so the only contention between producers is the CAS loop => if 
a producer is temporarily frozen, other producers can make progress (until the 
queue is full, of course) => it scales way better
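The second scheme can be sketched with standard atomics (a hedged sketch with a hypothetical class name; JCTools gets the same effect from Unsafe's CAS and ordered stores, with far more care around memory layout): the CAS loop on the producer index is the only producer-side contention point, and the lazySet publish keeps the two orders independent.

```java
import java.util.concurrent.atomic.AtomicLong;
import java.util.concurrent.atomic.AtomicReferenceArray;

// Unsafe-free core of an MPSC offer: CAS to claim a slot, lazySet to
// publish it. Capacity must be a power of two; this is a sketch, not a
// production queue (e.g. producers read consumerIndex without ordering,
// so the full check is only conservative).
public final class PortableMpsc<E> {
    private final AtomicReferenceArray<E> buffer;
    private final AtomicLong producerIndex = new AtomicLong();
    private final int mask;
    private long consumerIndex; // single consumer: plain field in this sketch

    public PortableMpsc(int capacityPowerOfTwo) {
        buffer = new AtomicReferenceArray<>(capacityPowerOfTwo);
        mask = capacityPowerOfTwo - 1;
    }

    public boolean offer(E e) {
        long idx;
        do {
            idx = producerIndex.get();
            if (idx - consumerIndex >= buffer.length()) {
                return false;                          // queue looks full
            }
        } while (!producerIndex.compareAndSet(idx, idx + 1)); // claim slot
        buffer.lazySet((int) (idx & mask), e);         // ordered publish
        return true;
    }

    public E poll() {
        int slot = (int) (consumerIndex & mask);
        E e = buffer.get(slot);
        if (e == null) {
            return null;   // empty, or claimed but not yet published
        }
        buffer.lazySet(slot, null);
        consumerIndex++;
        return e;
    }
}
```

A producer frozen between the CAS and the lazySet only delays its own slot; the consumer simply sees a null there and retries, while other producers keep claiming and publishing.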

About the other caveats:
- JCTools is safe on 32-bit architectures
- padding is inheritance-based, so I'm sure it works on current JVMs
- about the potential isEmpty() issue, I have to think a little more about it 
to be sure, but I think it's OK for both the Conversant and JCTools queues: 
isEmpty() will always be called by the single consumer thread, so there 
shouldn't be any issue
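Inheritance-based padding means the hot field is isolated by dummy fields declared in super- and subclasses: current HotSpot lays out superclass fields before subclass fields and does not interleave across the hierarchy, so the counter ends up with 56 dead bytes on each side. A minimal sketch (illustrative class names, not JCTools' actual ones):

```java
// Padding before the hot field lives in the superclass, padding after it
// in the subclass; on current HotSpot this keeps `value` away from any
// other hot data on the same cache line.
class LhsPadding { long p01, p02, p03, p04, p05, p06, p07; }

class HotCounter extends LhsPadding { volatile long value; }

public final class PaddedCounter extends HotCounter {
    long p11, p12, p13, p14, p15, p16, p17;

    public void increment() { value++; }   // not atomic; illustration only
    public long get() { return value; }
}
```

This is also why the trick depends on JVM layout behavior rather than a language guarantee, hence the "on current JVMs" hedge above.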


> Add optional support for Conversant DisruptorBlockingQueue in AsyncAppender
> ---
>
> Key: LOG4J2-1430
> URL: https://issues.apache.org/jira/browse/LOG4J2-1430
> Project: Log4j 2
>  Issue Type: New Feature
>  Components: Appenders
>Affects Versions: 2.6.1
>Reporter: Matt Sicker
>Assignee: Matt Sicker
> Fix For: 2.7
>
> Attachments: AsyncAppenderPerf01.txt
>
>
> [Conversant Disruptor|https://github.com/conversant/disruptor] works as an 
> implementation of BlockingQueue that is much faster than ArrayBlockingQueue. 
> I did some benchmarks earlier and found it to be a bit faster:
> h3. AsyncAppender/ArrayBlockingQueue
> {code}
> Benchmark   Mode  Samples 
>Score   Error  Units
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput10Paramsthrpt   20 
>  1101267.173 ± 17583.204  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput11Paramsthrpt   20 
>  1128269.255 ± 12188.910  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput1Param  thrpt   20 
>  1525470.805 ± 56515.933  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput2Params thrpt   20 
>  1789434.196 ± 42733.475  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput3Params thrpt   20 
>  1803276.278 ± 34938.176  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput4Params thrpt   20 
>  1468550.776 ± 26402.286  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput5Params thrpt   20 
>  1322304.349 ± 22417.997  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput6Params thrpt   20 
>  1179756.489 ± 16502.276  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput7Params thrpt   20 
>  1324660.677 ± 18893.944  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput8Params thrpt   20 
>  1309365.962 ± 19602.489  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput9Params thrpt   20 
>  1422144.180 ± 20815.042  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughputSimple  thrpt   20 
>  1247862.372 ± 18300.764  ops/s
> {code}
> h3. AsyncAppender/DisruptorBlockingQueue
> {code}
> Benchmark   Mode  Samples 
>ScoreError  Units
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput10Paramsthrpt   20 
>  3704735.586 ±  59766.253  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput11Paramsthrpt   20 
>  3622175.410 ±  

[jira] [Commented] (LOG4J2-1430) Add optional support for Conversant DisruptorBlockingQueue in AsyncAppender

2016-06-19 Thread Anthony Maire (JIRA)

[ 
https://issues.apache.org/jira/browse/LOG4J2-1430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15338427#comment-15338427
 ] 

Anthony Maire commented on LOG4J2-1430:
---

I made some tests with the branch to compare the Conversant 
DisruptorBlockingQueue and the MpscArrayQueue from JCTools 
(https://github.com/JCTools/JCTools). Since the JCTools queue isn't a 
BlockingQueue, I extended it to add the BlockingQueue methods, using 
LockSupport.parkNanos(1L) when waiting was needed.
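The wrapper described above can be sketched like this (a hypothetical class, with any non-blocking Queue standing in for MpscArrayQueue, whose API is not reproduced here):

```java
import java.util.Queue;
import java.util.concurrent.locks.LockSupport;

// Minimal blocking facade over a non-blocking queue: retry offer/poll
// with parkNanos(1L) between attempts, as in the benchmark wrapper.
public final class ParkingBlockingFacade<E> {
    private final Queue<E> delegate;

    public ParkingBlockingFacade(Queue<E> delegate) {
        this.delegate = delegate;
    }

    public void put(E e) {
        while (!delegate.offer(e)) {
            LockSupport.parkNanos(1L); // actual pause is OS-dependent
        }
    }

    public E take() {
        E e;
        while ((e = delegate.poll()) == null) {
            LockSupport.parkNanos(1L);
        }
        return e;
    }
}
```

A real BlockingQueue implementation would also need the timed and bulk methods; the point here is only that the wait strategy lives entirely in this loop, which is why its parkNanos granularity dominates the comparison.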

I first ran the JMH benchmark on my machine (Windows 7 on a Core i5-2500K, 4 
cores, no HT) with 8 and 16 threads, and there was no noticeable difference 
between the two alternative implementations

Then, since the LockSupport.parkNanos(1L) behavior is really dependent on the OS 
(cf. 
http://vanillajava.blogspot.fr/2012/04/yield-sleep0-wait01-and-parknanos1.html 
), I made some tests on a Linux server: CentOS 6.7 with dual Xeon E5-2643 v3 
(24 hyperthreads for the whole machine). There, the parkNanos pause was around 
50µs instead of 1ms on Windows

There was no noticeable difference up to 8 threads. As the number of threads 
increased, the DisruptorBlockingQueue performed badly and had no clear 
advantage over ArrayBlockingQueue at 36 threads, whereas the MPSC queue kept a 
constant throughput

Here are the rough figures I got, in the following order: ABQ, Conversant 
Disruptor, JCTools MPSC:
{noformat}
 1 thread  : 925k  3M     3.15M
 8 threads : 600k  2.95M  3.15M
12 threads : 550k  2.2M   3.10M
16 threads : 500K  1.9M   3.6M
24 threads : 600K  650K   3.45M
36 threads : 530K  600K   3.5M
{noformat}

I assume that the high pause time of parkNanos on Windows (used by the Conversant 
wait strategy too) was artificially reducing contention between producers, 
allowing it to perform pretty well.

When I compare the latency profiles of both implementations (with JMH "SampleTime" 
mode) for 8 threads, JCTools performs better at the higher percentiles (the 
99.9999% percentile cannot be trusted because the number of samples is way too low)

{noformat}
Conversant Disruptor results :

Result: 2761.833 ±(99.9%) 26.908 ns/op [Average]
  Statistics: (min, avg, max) = (344.000, 2761.833, 16039936.000), stdev = 11613.072
  Confidence interval (99.9%): [2734.925, 2788.741]
  Samples, N = 2016821
        mean =   2761.833 ±(99.9%) 26.908 ns/op
         min =    344.000 ns/op
  p( 0.0000) =    344.000 ns/op
  p(50.0000) =   2148.000 ns/op
  p(90.0000) =   5024.000 ns/op
  p(95.0000) =   6728.000 ns/op
  p(99.0000) =  11440.000 ns/op
  p(99.9000) =  18912.000 ns/op
  p(99.9900) =  27328.000 ns/op
  p(99.9990) =  35509.234 ns/op
  p(99.9999) = 2346156.268 ns/op
         max = 16039936.000 ns/op



JCTools MPSC results

Result: 3115.571 ±(99.9%) 23.081 ns/op [Average]
  Statistics: (min, avg, max) = (325.000, 3115.571, 6905856.000), stdev = 9438.165
  Confidence interval (99.9%): [3092.490, 3138.651]
  Samples, N = 1810522
        mean =   3115.571 ±(99.9%) 23.081 ns/op
         min =    325.000 ns/op
  p( 0.0000) =    325.000 ns/op
  p(50.0000) =   2984.000 ns/op
  p(90.0000) =   4352.000 ns/op
  p(95.0000) =   4816.000 ns/op
  p(99.0000) =   5736.000 ns/op
  p(99.9000) =   6872.000 ns/op
  p(99.9900) =   7976.000 ns/op
  p(99.9990) =  10652.653 ns/op
  p(99.9999) = 6872656.978 ns/op
         max = 6905856.000 ns/op
{noformat}

Interesting fact: with 16 to 36 threads, top reported a CPU usage for the 
JCTools queue that didn't exceed 1200% (while the Conversant queue was using the 
full CPU), so half of the machine's CPU cycles were not used. I think 
there is something that can be improved, probably around my quick-and-dirty 
BlockingQueue implementation.

Anyway, I'm really not sure that using Conversant DisruptorBlockingQueue is a 
good alternative to ABQ.



> Add optional support for Conversant DisruptorBlockingQueue in AsyncAppender
> ---
>
> Key: LOG4J2-1430
> URL: https://issues.apache.org/jira/browse/LOG4J2-1430
> Project: Log4j 2
>  Issue Type: New Feature
>  Components: Appenders
>Affects Versions: 2.6.1
>Reporter: Matt Sicker
>Assignee: Matt Sicker
> Fix For: 2.7
>
> Attachments: AsyncAppenderPerf01.txt
>
>
> [Conversant Disruptor|https://github.com/conversant/disruptor] works as an 
> implementation of BlockingQueue that is much faster than ArrayBlockingQueue. 
> I did some benchmarks earlier and found it to be a bit faster:
> h3. AsyncAppender/ArrayBlockingQueue
> {code}
> Benchmark   Mode  Samples 
>Score   Error  Units
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput10Paramsthrpt   20 
>  1101267.173 ± 17583.204  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput11Paramsthrpt   20 
>  1128269.255 ± 12188.910  ops/s
> 

[jira] [Commented] (LOG4J2-1438) Add method getParameter() to ObjectMessage (and ReusableObjectMessage)

2016-06-19 Thread Anthony Maire (JIRA)

[ 
https://issues.apache.org/jira/browse/LOG4J2-1438?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15338406#comment-15338406
 ] 

Anthony Maire commented on LOG4J2-1438:
---

I think that having getParametersCount() + getParameter(int) would be better: 
it may be useful for writing gc-free custom filters that need access to the 
parameters, for instance
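A hypothetical sketch of why indexed access helps such a filter; getParametersCount()/getParameter(int) are only the proposed methods, declared here on a stand-in interface rather than Log4j's actual Message API:

```java
// Stand-in for the proposed message API: indexed access instead of
// getParameters(), which would have to allocate a temporary Object[].
interface IndexedParameterMessage {
    int getParametersCount();
    Object getParameter(int index);
}

// A gc-free style filter: scans parameters without materializing an array.
class ContainsParameterFilter {
    private final Object needle;

    ContainsParameterFilter(Object needle) {
        this.needle = needle;
    }

    boolean accept(IndexedParameterMessage msg) {
        for (int i = 0; i < msg.getParametersCount(); i++) {
            if (needle.equals(msg.getParameter(i))) {
                return true; // no Object[] allocated on this path
            }
        }
        return false;
    }
}
```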

> Add method getParameter() to ObjectMessage (and ReusableObjectMessage)
> --
>
> Key: LOG4J2-1438
> URL: https://issues.apache.org/jira/browse/LOG4J2-1438
> Project: Log4j 2
>  Issue Type: Improvement
>  Components: API
>Affects Versions: 2.6, 2.6.1
>Reporter: Remko Popma
> Fix For: 2.7
>
>
> While analysing LOG4J2-1397 I found it is currently not possible for client 
> code to get a reference to the Object in the ObjectMessage without creating a 
> temporary (single-element) array (returned from {{getParameters()}}).
> This ticket proposes to add a method to the ObjectMessage and 
> ReusableObjectMessage interface that gives direct access to the object 
> parameter.
> I'm flexible with regards to the name of this new method. I thought of 
> {{getParameter()}} but an alternative could be {{getObject()}} or something 
> else.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

-
To unsubscribe, e-mail: log4j-dev-unsubscr...@logging.apache.org
For additional commands, e-mail: log4j-dev-h...@logging.apache.org



[jira] [Commented] (LOG4J2-1430) Add optional support for Conversant DisruptorBlockingQueue in AsyncAppender

2016-06-18 Thread Anthony Maire (JIRA)

[ 
https://issues.apache.org/jira/browse/LOG4J2-1430?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15337649#comment-15337649
 ] 

Anthony Maire commented on LOG4J2-1430:
---

There was a thread about this queue on the mechanical sympathy group: 
https://groups.google.com/forum/#!searchin/mechanical-sympathy/conversant/mechanical-sympathy/c5x0c2Zsfpc/BmcnRihGFAAJ
I had a very quick look at the current code, and the same issues seem to be 
still present:

- Contention between producers: it will not scale if there are more producers 
than available cores. That's basically the difference between disruptor 3.x and 
earlier versions; more details in the second half of this talk by Michael 
Barker (LMAX): 
https://www.youtube.com/watch?v=VBnLW9mKMh4
- Potentially broken on 32-bit systems (writes to non-volatile longs are not 
guaranteed to be atomic)
- Not sure that the padding mechanism used in their code works, since the JVM 
can re-order fields. The most robust way to pad is to use inheritance.
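The inheritance-based padding mentioned above looks like this. It is a generic illustration of the pattern (as popularized by the LMAX Disruptor), with hypothetical class names, not Conversant's actual code: superclass fields are laid out before subclass fields, so the JVM cannot reorder the pad words next to the hot field.

```java
// Left-hand padding: these longs are laid out before the hot field,
// because superclass fields precede subclass fields in the object layout.
class LhsPadding {
    protected long p1, p2, p3, p4, p5, p6, p7;
}

// The hot field, isolated from neighbours on its left by LhsPadding.
class Value extends LhsPadding {
    protected volatile long value;
}

// Right-hand padding: isolates the hot field from whatever the JVM
// allocates after this object on the same cache line.
class PaddedCounter extends Value {
    protected long p9, p10, p11, p12, p13, p14, p15;

    // Single-writer increment (a volatile ++ is not atomic for multiple writers).
    public long increment() {
        return ++value;
    }

    public long get() {
        return value;
    }
}
```

Within a single class the JVM is free to shuffle field order, which is why plain "pad fields around the hot field" declarations may silently stop working; the class hierarchy pins the relative order.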

Moreover, due to the asynchronous nature of this kind of queue, I'm not sure 
that we will always detect when the queue is empty, which is critical since the 
isEmpty() method is called to decide whether a flush is required when 
immediateFlush = false. Further investigation is needed on this subject

> Add optional support for Conversant DisruptorBlockingQueue in AsyncAppender
> ---
>
> Key: LOG4J2-1430
> URL: https://issues.apache.org/jira/browse/LOG4J2-1430
> Project: Log4j 2
>  Issue Type: New Feature
>  Components: Appenders
>Affects Versions: 2.6.1
>Reporter: Matt Sicker
>Assignee: Matt Sicker
> Fix For: 2.7
>
>
> [Conversant Disruptor|https://github.com/conversant/disruptor] works as an 
> implementation of BlockingQueue that is much faster than ArrayBlockingQueue. 
> I did some benchmarks earlier and found it to be a bit faster:
> h3. AsyncAppender/ArrayBlockingQueue
> {code}
> Benchmark   Mode  Samples 
>Score   Error  Units
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput10Paramsthrpt   20 
>  1101267.173 ± 17583.204  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput11Paramsthrpt   20 
>  1128269.255 ± 12188.910  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput1Param  thrpt   20 
>  1525470.805 ± 56515.933  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput2Params thrpt   20 
>  1789434.196 ± 42733.475  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput3Params thrpt   20 
>  1803276.278 ± 34938.176  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput4Params thrpt   20 
>  1468550.776 ± 26402.286  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput5Params thrpt   20 
>  1322304.349 ± 22417.997  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput6Params thrpt   20 
>  1179756.489 ± 16502.276  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput7Params thrpt   20 
>  1324660.677 ± 18893.944  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput8Params thrpt   20 
>  1309365.962 ± 19602.489  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput9Params thrpt   20 
>  1422144.180 ± 20815.042  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughputSimple  thrpt   20 
>  1247862.372 ± 18300.764  ops/s
> {code}
> h3. AsyncAppender/DisruptorBlockingQueue
> {code}
> Benchmark   Mode  Samples 
>ScoreError  Units
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput10Paramsthrpt   20 
>  3704735.586 ±  59766.253  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput11Paramsthrpt   20 
>  3622175.410 ±  31975.353  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput1Param  thrpt   20 
>  6862480.428 ± 121473.276  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput2Params thrpt   20 
>  6193288.988 ±  93545.144  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput3Params thrpt   20 
>  5715621.712 ± 131878.581  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput4Params thrpt   20 
>  5745187.005 ± 213854.016  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput5Params thrpt   20 
>  5307137.396 ±  88135.709  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput6Params thrpt   20 
>  4953015.419 ±  72100.403  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput7Params thrpt   20 
>  4833836.418 ±  52919.314  ops/s
> o.a.l.l.p.j.AsyncAppenderLog4j2Benchmark.throughput8Params thrpt   20 
>  4353791.507 ±  79047.812  ops/s
> 

[jira] [Closed] (LOG4J2-1415) Avoid calling toString() on auto-boxed primitive message parameters

2016-06-11 Thread Anthony Maire (JIRA)

 [ 
https://issues.apache.org/jira/browse/LOG4J2-1415?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anthony Maire closed LOG4J2-1415.
-

Tested on master; it works as expected

> Avoid calling toString() on auto-boxed primitive message parameters
> ---
>
> Key: LOG4J2-1415
> URL: https://issues.apache.org/jira/browse/LOG4J2-1415
> Project: Log4j 2
>  Issue Type: Improvement
>Affects Versions: 2.6
>Reporter: Anthony Maire
>Assignee: Remko Popma
>Priority: Minor
> Fix For: 2.6.2
>
>
> When using Log4j 2 through the SLF4J binding, the Unboxer mechanism is not 
> available, and a parameterized message with a primitive type parameter will be 
> auto-boxed.
> Then this boxed value will be formatted by 
> ParameterFormatter.recursiveDeepToString(), and further allocations will happen.
> To lower the allocation rate for SLF4J users, 
> ParameterFormatter.appendSpecialTypes() should handle boxed primitive types 
> too.
> {code}
> private static boolean appendSpecialTypes(final Object o, final StringBuilder 
> str) {
> ...
> } else if (o instanceof Long) {
> str.append(((Long) o).longValue());
> return true;
> } else if (o instanceof Integer) {
> str.append(((Integer) o).intValue());
> return true;
> } else if (o instanceof Double) {
> str.append(((Double) o).doubleValue());
> return true;
> } // similarly for float, short, boolean and char.
> ...
> }
> {code}






[jira] [Updated] (LOG4J2-1421) Custom plugins packaged in jar files are not loaded by "packages" property mechanism

2016-06-10 Thread Anthony Maire (JIRA)

 [ 
https://issues.apache.org/jira/browse/LOG4J2-1421?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Anthony Maire updated LOG4J2-1421:
--
Attachment: log4j-plugin-load-issue.zip

Minimal example attached: when the DefaultFilter.java file is in the Eclipse 
project, the plugin does work. If it is not present and the jar file is there 
instead, it doesn't work (the application verifies whether the DefaultFilter 
class is available on the classpath)

> Custom plugins packaged in jar files are not loaded by "packages" property 
> mechanism
> ---
>
> Key: LOG4J2-1421
> URL: https://issues.apache.org/jira/browse/LOG4J2-1421
> Project: Log4j 2
>  Issue Type: Bug
>  Components: Plugins
>Affects Versions: 2.6
>Reporter: Anthony Maire
> Attachments: log4j-plugin-load-issue.zip
>
>
> the "packages" attribute in configuration file works properly to load custom 
> plugins from Eclipse. However, if the plugin class is packaged alone in a jar 
> file (without the Log4j2Plugins.dat file in it), the "packages" property load 
> mecanism is not working.
> 2016-06-09 19:40:44,823 main DEBUG Took 0,001222 seconds to load 0 plugins 
> from package test






[jira] [Created] (LOG4J2-1421) Custom plugins packaged in jar files are not loaded by "packages" property mechanism

2016-06-10 Thread Anthony Maire (JIRA)
Anthony Maire created LOG4J2-1421:
-

 Summary: Custom plugins packaged in jar files are not loaded by 
"packages" property mechanism
 Key: LOG4J2-1421
 URL: https://issues.apache.org/jira/browse/LOG4J2-1421
 Project: Log4j 2
  Issue Type: Bug
  Components: Plugins
Affects Versions: 2.6
Reporter: Anthony Maire


the "packages" attribute in configuration file works properly to load custom 
plugins from Eclipse. However, if the plugin class is packaged alone in a jar 
file (without the Log4j2Plugins.dat file in it), the "packages" property load 
mecanism is not working.

2016-06-09 19:40:44,823 main DEBUG Took 0,001222 seconds to load 0 plugins from 
package test






[jira] [Commented] (LOG4J2-1397) Support ByteBufferLayout

2016-06-09 Thread Anthony Maire (JIRA)

[ 
https://issues.apache.org/jira/browse/LOG4J2-1397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15322196#comment-15322196
 ] 

Anthony Maire commented on LOG4J2-1397:
---

I agree that this is a narrow use case.

Another possible approach is to let the user inject their own implementation of 
the ringbuffer slot factory (and provide the methods to transfer data from the 
application thread to the ringbuffer slot, then from the slot to the background 
thread). Does that seem OK to you?

Basically, all use cases where the user wants to access parameters at the 
appender level (background formatting, custom filters ...) are potentially 
broken with async loggers if those parameters are mutable. And they will almost 
always be narrow use cases. I think a good solution is to give the user a way 
to copy the piece of data that needs to be copied for their use case





> Support ByteBufferLayout
> 
>
> Key: LOG4J2-1397
> URL: https://issues.apache.org/jira/browse/LOG4J2-1397
> Project: Log4j 2
>  Issue Type: New Feature
>  Components: Layouts
>Affects Versions: 2.6
>Reporter: Remko Popma
>
> As requested by Kirk Pepperdine on the Mechanical Sympathy [mailing 
> list|https://groups.google.com/d/msg/mechanical-sympathy/klefjRqlpQE/-1WF59IGAwAJ]:
> Support a new Layout that takes ObjectMessages containing a ByteBuffer 
> payload and write them to the appender without modification.






[jira] [Commented] (LOG4J2-1397) Support ByteBufferLayout

2016-06-09 Thread Anthony Maire (JIRA)

[ 
https://issues.apache.org/jira/browse/LOG4J2-1397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15322161#comment-15322161
 ] 

Anthony Maire commented on LOG4J2-1397:
---

It just doesn't work because of the different orderings in the preallocated array 
and the ringbuffer. Let's take a simple example:
- 2 threads
- a ringbuffer with 2 slots
So we need a preallocated array with 4 buffers.

- Thread A gets sequence #1 from the atomic counter, then gets interrupted by the 
system scheduler before doing the copy
- Thread B is still logging: it gets sequences #2, #3, #4 from the counter, 
pushes 2 events to the async logger (so the ringbuffer is full) and is blocked 
trying to push the third
- The async logger processes the first event submitted by thread B
- Thread B can push another event (with preallocated buffer sequence #4)
- Thread B wants to log another event, so it gets sequence #5 for the 
preallocated array
- Thread A is resumed by the scheduler

=> Data race: both threads will copy data into the same preallocated buffer
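The collision in the scenario above falls straight out of the index arithmetic, assuming the scheme maps a sequence to a buffer with seq % poolSize (a hypothetical but representative indexing): with 4 preallocated buffers, stalled sequence #1 and live sequence #5 resolve to the same slot.

```java
// Illustration of the data race: with a pool of 4 buffers indexed by
// sequence % poolSize, thread A's stalled sequence #1 and thread B's
// live sequence #5 resolve to the same preallocated buffer.
public class SlotCollision {
    public static void main(String[] args) {
        int poolSize = 4;            // 2 ring slots => 4 preallocated buffers
        long threadASeq = 1;         // claimed, then thread A was descheduled
        long threadBSeq = 5;         // claimed after #2..#4 were consumed
        int slotA = (int) (threadASeq % poolSize);
        int slotB = (int) (threadBSeq % poolSize);
        System.out.println(slotA == slotB); // prints true: both map to slot 1
    }
}
```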

> Support ByteBufferLayout
> 
>
> Key: LOG4J2-1397
> URL: https://issues.apache.org/jira/browse/LOG4J2-1397
> Project: Log4j 2
>  Issue Type: New Feature
>  Components: Layouts
>Affects Versions: 2.6
>Reporter: Remko Popma
>
> As requested by Kirk Pepperdine on the Mechanical Sympathy [mailing 
> list|https://groups.google.com/d/msg/mechanical-sympathy/klefjRqlpQE/-1WF59IGAwAJ]:
> Support a new Layout that takes ObjectMessages containing a ByteBuffer 
> payload and write them to the appender without modification.






[jira] [Commented] (LOG4J2-1397) Support ByteBufferLayout

2016-06-09 Thread Anthony Maire (JIRA)

[ 
https://issues.apache.org/jira/browse/LOG4J2-1397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15322038#comment-15322038
 ] 

Anthony Maire commented on LOG4J2-1397:
---

At first sight, this approach may work correctly in a single-threaded context, 
but I think you need one more preallocated buffer than the ringbuffer size (and 
therefore a modulo instead of a binary mask).

However, in our application this code can be invoked by several threads.
Since the order in which buffers are obtained from the preallocated pool might 
not be the same as their order in the ringbuffer, the application will 
then need a callback to know when a buffer has been released (i.e. logged to 
disk).



> Support ByteBufferLayout
> 
>
> Key: LOG4J2-1397
> URL: https://issues.apache.org/jira/browse/LOG4J2-1397
> Project: Log4j 2
>  Issue Type: New Feature
>  Components: Layouts
>Affects Versions: 2.6
>Reporter: Remko Popma
>
> As requested by Kirk Pepperdine on the Mechanical Sympathy [mailing 
> list|https://groups.google.com/d/msg/mechanical-sympathy/klefjRqlpQE/-1WF59IGAwAJ]:
> Support a new Layout that takes ObjectMessages containing a ByteBuffer 
> payload and write them to the appender without modification.






[jira] [Commented] (LOG4J2-1397) Support ByteBufferLayout

2016-06-08 Thread Anthony Maire (JIRA)

[ 
https://issues.apache.org/jira/browse/LOG4J2-1397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15321285#comment-15321285
 ] 

Anthony Maire commented on LOG4J2-1397:
---

Basically, a layout that has a dependency on a mutable object will not be 
compatible with asynchronous loggers/appenders. A copy of the content has to be 
made, and to avoid excessive allocation the copy should be reusable.

If we want to safely use the buffer content without excessive allocation, we 
already have some String-based solutions (such as wrapping the buffer in a 
StringBuilderFormattable that decodes its content into a StringBuilder). 
But this is slow: it takes a few extra microseconds for a message of a few 
hundred bytes, compared to a bulk byte copy.
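The String-based workaround can be sketched like this. StringBuilderFormattable is Log4j's real interface (org.apache.logging.log4j.util.StringBuilderFormattable), re-declared here as a minimal stand-in so the example compiles without the log4j dependency; the single-byte decode loop assumes ASCII/ISO-8859-1 content, which holds for FIX messages:

```java
import java.nio.ByteBuffer;

// Stand-in for org.apache.logging.log4j.util.StringBuilderFormattable,
// declared locally so the sketch is self-contained.
interface StringBuilderFormattable {
    void formatTo(StringBuilder buffer);
}

// Wraps a ByteBuffer and decodes it char-by-char into the StringBuilder
// when the background thread formats the event. Assumes single-byte
// (ASCII/ISO-8859-1) content, as in FIX messages.
class ByteBufferFormattable implements StringBuilderFormattable {
    private final ByteBuffer buffer;

    ByteBufferFormattable(ByteBuffer buffer) {
        this.buffer = buffer;
    }

    @Override
    public void formatTo(StringBuilder out) {
        // Iterate by absolute index so the buffer's position is untouched.
        for (int i = buffer.position(); i < buffer.limit(); i++) {
            out.append((char) (buffer.get(i) & 0xFF));
        }
    }
}
```

The per-character append is what costs the extra microseconds relative to a bulk byte copy into the destination buffer.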

If we want to use a ByteBufferLayout, that means these few microseconds 
matter, so it's nearly 100% certain that asynchronous logging will be used. 
Without some byte-copy support provided by the async logger, I'm afraid 
this layout can't be really useful.

> Support ByteBufferLayout
> 
>
> Key: LOG4J2-1397
> URL: https://issues.apache.org/jira/browse/LOG4J2-1397
> Project: Log4j 2
>  Issue Type: New Feature
>  Components: Layouts
>Affects Versions: 2.6
>Reporter: Remko Popma
>
> As requested by Kirk Pepperdine on the Mechanical Sympathy [mailing 
> list|https://groups.google.com/d/msg/mechanical-sympathy/klefjRqlpQE/-1WF59IGAwAJ]:
> Support a new Layout that takes ObjectMessages containing a ByteBuffer 
> payload and write them to the appender without modification.






[jira] [Commented] (LOG4J2-1397) Support ByteBufferLayout

2016-06-08 Thread Anthony Maire (JIRA)

[ 
https://issues.apache.org/jira/browse/LOG4J2-1397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15320175#comment-15320175
 ] 

Anthony Maire commented on LOG4J2-1397:
---

Here is my use case:
- The application receives FIX messages (an ASCII-based protocol used in 
financial services) at a rate of a few thousand to a hundred thousand 
messages per second
- The application needs to log every message received (mostly for legal and 
auditing reasons)
- Latency is critical, so logging needs to be asynchronous, since the jitter 
induced by kernel write-back can be very high at this rate

Our (simplified) architecture is something like this :

{code}
public class Connection {

private SocketChannel channel;
private ByteBuffer buffer;
private MessageListener listener;

...

//called when the main loop involving a NIO Selector told us that there 
is data to process for this socket
public void onDataReady(){ 
channel.read(buffer);
if(containsCompleteMessage(buffer)){
listener.onIncomingMessage(buffer, this);
}
}

public String toString(){
return "Connection [remoteAddress=" + 
channel.getRemoteAddress() + "]";
}
}

public class MessageListener {

private static final Logger messageLog = 
LoggerFactory.getLogger("message.logger");
private static final Logger log = 
LoggerFactory.getLogger(MessageListener.class);
private FIXMessageCodec codec;

...

public void onIncomingMessage(ByteBuffer buffer, Connection connection){
messageLog.info("Message on {}: IN {}", connection, buffer);
FIXMessage businessMessage = codec.decode(buffer);
buffer.clear();
log.debug("start business processing for {}", businessMessage);
doBusinessProcessing(businessMessage);
}
}
{code}

So we have a dedicated logger for these raw messages that we need to store on 
disk.
Currently we are using logback through SLF4J, and the first line of 
onIncomingMessage is replaced by something like this:
{code}
if(messageLog.isInfoEnabled()){
ByteBuffer copy = ByteBuffer.allocate(buffer.remaining());
copy.put(buffer);
buffer.rewind();
copy.flip();
messageLog.info("Message on {}: IN {}", connection, copy);
}
{code}

And we have a custom layout that processes the content of the ByteBuffer 
in the background thread of the AsyncAppender.
Making this copy is not currently an issue, since logback already allocates 
roughly 20 times the buffer's size per event, but since we are now looking at 
log4j for its low allocation rate, we want to avoid that garbage.

> Support ByteBufferLayout
> 
>
> Key: LOG4J2-1397
> URL: https://issues.apache.org/jira/browse/LOG4J2-1397
> Project: Log4j 2
>  Issue Type: New Feature
>  Components: Layouts
>Affects Versions: 2.6
>Reporter: Remko Popma
>
> As requested by Kirk Pepperdine on the Mechanical Sympathy [mailing 
> list|https://groups.google.com/d/msg/mechanical-sympathy/klefjRqlpQE/-1WF59IGAwAJ]:
> Support a new Layout that takes ObjectMessages containing a ByteBuffer 
> payload and write them to the appender without modification.






[jira] [Commented] (LOG4J2-1397) Support ByteBufferLayout

2016-06-08 Thread Anthony Maire (JIRA)

[ 
https://issues.apache.org/jira/browse/LOG4J2-1397?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15320112#comment-15320112
 ] 

Anthony Maire commented on LOG4J2-1397:
---

The issue with a ThreadLocal is that we need to pass the copy between 2 threads 
through the disruptor. That means there should be some support for swapping 
cached buffers in MutableLogEvent, for example (the same idea as the parameter 
swap that was done for LOG4J2-1342). Without this kind of support, we need 
either to create a defensive copy (and create garbage) or to use a shared pool 
of ByteBuffers (and incur a synchronization performance penalty)
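The swap idea can be sketched as follows; ReusableEvent is a hypothetical minimal stand-in for MutableLogEvent (not Log4j's actual class), just to show the exchange that avoids both a defensive copy and a shared pool:

```java
import java.nio.ByteBuffer;

// Sketch of the buffer-swap idea: the producer hands its filled buffer to
// the reusable event and takes back the event's previous (already drained)
// buffer, so no allocation and no shared pool are needed.
class ReusableEvent { // hypothetical stand-in for MutableLogEvent
    private ByteBuffer payload = ByteBuffer.allocate(4096);

    // Called by the producer thread when claiming a ring-buffer slot:
    // exchange buffers instead of copying bytes.
    ByteBuffer swapPayload(ByteBuffer filled) {
        ByteBuffer drained = this.payload;
        this.payload = filled;
        drained.clear(); // ready for the producer to reuse
        return drained;
    }

    ByteBuffer payload() {
        return payload;
    }
}
```

Each thread keeps exactly one spare buffer in flight, which is the same ownership-transfer trick the LOG4J2-1342 parameter swap uses for the parameter array.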


> Support ByteBufferLayout
> 
>
> Key: LOG4J2-1397
> URL: https://issues.apache.org/jira/browse/LOG4J2-1397
> Project: Log4j 2
>  Issue Type: New Feature
>  Components: Layouts
>Affects Versions: 2.6
>Reporter: Remko Popma
>
> As requested by Kirk Pepperdine on the Mechanical Sympathy [mailing 
> list|https://groups.google.com/d/msg/mechanical-sympathy/klefjRqlpQE/-1WF59IGAwAJ]:
> Support a new Layout that takes ObjectMessages containing a ByteBuffer 
> payload and write them to the appender without modification.






[jira] [Commented] (LOG4J2-1415) Avoid calling toString() on auto-boxed primitive message parameters

2016-06-07 Thread Anthony Maire (JIRA)

[ 
https://issues.apache.org/jira/browse/LOG4J2-1415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=15318186#comment-15318186
 ] 

Anthony Maire commented on LOG4J2-1415:
---

I don't think the way the "switch" is implemented has a major impact on 
performance.

I made a small JMH benchmark, running on a Xeon E5-2643 v3 with JVM 1.8.0_92 
(same kind of results on my desktop with a Core i7 2600), and the cost of 
instanceof is barely noticeable (and lower than the class-name switch)

{noformat}
Benchmark  Mode  Cnt ScoreError  Units
SwitchClassTest.referenceWithCast  avgt   25  1958.885 ±  7.773  ns/op
SwitchClassTest.referenceWithToString  avgt   25  2050.443 ± 10.347  ns/op
SwitchClassTest.useClassName   avgt   25  2015.931 ± 10.190  ns/op
SwitchClassTest.useInstanceOf  avgt   25  1978.622 ± 10.030  ns/op
{noformat}

Here is the benchmark code:
{code}


import java.util.concurrent.TimeUnit;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Fork;
import org.openjdk.jmh.annotations.Measurement;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.annotations.Warmup;

@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@State(Scope.Benchmark)
@Warmup(iterations=5)
@Measurement(iterations=5)
@Fork(value=5)
public class SwitchClassTest {

private static final String STRING_CLASS_NAME = "java.lang.String";
private static final String CHARACTER_CLASS_NAME = 
"java.lang.Character";
private static final String LONG_CLASS_NAME = "java.lang.Long";
private static final String INTEGER_CLASS_NAME = "java.lang.Integer";
private static final String BOOLEAN_CLASS_NAME = "java.lang.Boolean";
private static final String DOUBLE_CLASS_NAME = "java.lang.Double";
private static final String FLOAT_CLASS_NAME = "java.lang.Float";

private Object[] params = new Object[]{"abc", 'a', Long.MAX_VALUE, 
Integer.MAX_VALUE, Double.MAX_VALUE, Float.MAX_VALUE, Boolean.TRUE};
private StringBuilder sb = new StringBuilder();


@Benchmark
public int referenceWithToString() {
for (Object param : params) {
sb.append(param);
}
int size = sb.length();
sb.setLength(0);
return size;
}

@Benchmark
public int referenceWithCast() {
sb.append(params[0]);
sb.append(((Character) params[1]).charValue());
sb.append(((Long) params[2]).longValue());
sb.append(((Integer) params[3]).intValue());
sb.append(((Double) params[4]).doubleValue());
sb.append(((Float) params[5]).floatValue());
sb.append(((Boolean) params[6]).booleanValue());
int size = sb.length();
sb.setLength(0);
return size;
}

@Benchmark
public int useInstanceOf() {
for (Object param : params) {
if (param instanceof String) {
sb.append(param);
} else if (param instanceof Character) {
sb.append(((Character) param).charValue());
} else if (param instanceof Long) {
sb.append(((Long) param).longValue());
} else if (param instanceof Integer) {
sb.append(((Integer) param).intValue());
} else if (param instanceof Double) {
sb.append(((Double) param).doubleValue());
} else if (param instanceof Float) {
sb.append(((Float) param).floatValue());
} else if (param instanceof Boolean) {
sb.append(((Boolean) param).booleanValue());
}
}
int size = sb.length();
sb.setLength(0);
return size;
}

@Benchmark
public int useClassName() {
for (Object param : params) {
String className = param.getClass().getName();
switch (className){
case STRING_CLASS_NAME:
sb.append(param);
break;
case CHARACTER_CLASS_NAME:
sb.append(((Character) param).charValue());
break;
   

[jira] [Created] (LOG4J2-1415) When running in garbage-free mode, boxed primitive types should be formatted without allocating

2016-06-06 Thread Anthony Maire (JIRA)
Anthony Maire created LOG4J2-1415:
-

 Summary: When running in garbage-free mode, boxed primitive types 
should be formatted without allocating
 Key: LOG4J2-1415
 URL: https://issues.apache.org/jira/browse/LOG4J2-1415
 Project: Log4j 2
  Issue Type: Improvement
Affects Versions: 2.6
Reporter: Anthony Maire
Priority: Minor


When using Log4j 2 through the SLF4J binding, the Unboxer mechanism is not 
available, and a parameterized message with a primitive type parameter will be 
auto-boxed.

Then this boxed value will be formatted by 
ParameterFormatter.recursiveDeepToString(), and further allocations will happen.

To lower the allocation rate for SLF4J users, 
ParameterFormatter.appendSpecialTypes() should handle boxed primitive types too.

{code}

private static boolean appendSpecialTypes(final Object o, final StringBuilder 
str) {
...
} else if (o instanceof Long) {
str.append(((Long) o).longValue());
return true;
} else if (o instanceof Integer) {
str.append(((Integer) o).intValue());
return true;
} else if (o instanceof Double) {
str.append(((Double) o).doubleValue());
return true;
} // similarly for float, short, boolean and char.
...
}
{code}


