The first stack trace indicates that the application produces log events faster 
than they can be consumed. So let’s take a look at the components that sit 
downstream from the ringbuffer.  

You mentioned you’re using a RollingFileAppender. That is usually not a 
bottleneck by itself, but an aggressive rollover policy can make it one. How 
often do you roll over?

(RollingRandomAccessFileAppender is a bit faster, but not by much; see 
https://logging.apache.org/log4j/2.x/performance.html#tradeoffs ).
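
For reference, the rollover policy lives on the appender; a size-based setup 
(shown here with the RandomAccessFile variant) typically looks something like 
the sketch below. The names, paths, and sizes are placeholders, not a 
recommendation:

  <!-- Hypothetical appender for illustration; adjust names, paths and sizes to your setup. -->
  <RollingRandomAccessFile name="File"
                           fileName="logs/app.log"
                           filePattern="logs/app-%d{yyyy-MM-dd}-%i.log.gz">
    <PatternLayout pattern="%d %p %c{1.} [%t] %m%n"/>
    <Policies>
      <!-- Rolling over very frequently at 3k+ events/sec can stall the single appender thread. -->
      <SizeBasedTriggeringPolicy size="250 MB"/>
    </Policies>
    <DefaultRolloverStrategy max="20"/>
  </RollingRandomAccessFile>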

Another potential source of slowness is logging with location 
(https://logging.apache.org/log4j/2.x/performance.html#asyncLoggingWithLocation).
Are you using layout options like HTML locationInfo, or one of the patterns %C 
or %class, %F or %file, %l or %location, %L or %line, %M or %method?
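
To illustrate the difference, here are two PatternLayout sketches (the patterns 
themselves are only examples, not your actual config):

  <!-- Captures location: %C, %F and %L force a stack trace walk for every event. -->
  <PatternLayout pattern="%d %p %C{1.} [%t] %F:%L - %m%n"/>

  <!-- No location converters: %c is just the logger name, essentially free. -->
  <PatternLayout pattern="%d %p %c{1.} [%t] %m%n"/>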

Also, do you have any custom components like Filters or Lookups configured?

Finally, I don’t know yet what to think of that 2nd stack trace with 
setPriority... It could be significant, but it could also be a red herring. 
Let’s look further into that after excluding the usual suspects I mentioned 
above. 

Remko.

(Shameless plug) Every java main() method deserves http://picocli.info

> On Mar 28, 2019, at 0:50, Anthony Maire <[email protected]> wrote:
> 
> Hi John
> 
> If you have this stack trace in the threads that are producing messages,
> then your disruptor is full because the appender is not capable of writing
> fast enough.
> Since you have a single appender, it might be useful to switch to
> RollingRandomAccessFileAppender, which can batch writes to disk to reach
> a greater throughput if you are limited by I/O operations.
> In your case I would probably run a profiling session with Java Mission
> Control or any other sampling profiler that you are used to, to see whether
> the slowness in the single appender thread is caused by I/O operations or by
> the processing/formatting of the message itself.
> 
> Best regards,
> Anthony
> 
> On Wed, Mar 27, 2019 at 16:23, <[email protected]> wrote:
> 
>> Hi all,
>> 
>> I'm using Log4j2 2.11.2 and am evaluating switching from custom async
>> appenders to async loggers.
>> 
>> I've been running some scalability tests with async loggers and sometimes
>> get good results but mostly get terrible results at high loads.  At high
>> loads, I typically see many threads in this state:
>> 
>> http-bio-7114-exec-406 - priority:5 - threadId:0x00007f2f2c0cc000 -
>> nativeId:0x4b3e - nativeId (decimal):19262 - state:TIMED_WAITING
>> stackTrace:
>> java.lang.Thread.State: TIMED_WAITING (parking)
>> at sun.misc.Unsafe.park(Native Method)
>> at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:338)
>> at
>> com.lmax.disruptor.MultiProducerSequencer.next(MultiProducerSequencer.java:136)
>> at
>> com.lmax.disruptor.MultiProducerSequencer.next(MultiProducerSequencer.java:105)
>> at com.lmax.disruptor.RingBuffer.publishEvent(RingBuffer.java:465)
>> at com.lmax.disruptor.dsl.Disruptor.publishEvent(Disruptor.java:331)
>> at
>> org.apache.logging.log4j.core.async.AsyncLoggerDisruptor.enqueueLogMessageInfo(AsyncLoggerDisruptor.java:222)
>> at
>> org.apache.logging.log4j.core.async.AsyncLogger.handleRingBufferFull(AsyncLogger.java:181)
>> at
>> org.apache.logging.log4j.core.async.AsyncLogger.publish(AsyncLogger.java:166)
>> at
>> org.apache.logging.log4j.core.async.AsyncLogger.logWithThreadLocalTranslator(AsyncLogger.java:161)
>> at
>> org.apache.logging.log4j.core.async.AsyncLogger.logMessage(AsyncLogger.java:131)
>> at
>> org.apache.logging.log4j.spi.ExtendedLoggerWrapper.logMessage(ExtendedLoggerWrapper.java:217)
>> at
>> org.apache.logging.log4j.spi.AbstractLogger.tryLogMessage(AbstractLogger.java:2170)
>> at
>> org.apache.logging.log4j.spi.AbstractLogger.logMessageTrackRecursion(AbstractLogger.java:2125)
>> at
>> org.apache.logging.log4j.spi.AbstractLogger.logMessageSafely(AbstractLogger.java:2108)
>> at
>> org.apache.logging.log4j.spi.AbstractLogger.logMessage(AbstractLogger.java:2002)
>> at
>> org.apache.logging.log4j.spi.AbstractLogger.logIfEnabled(AbstractLogger.java:1974)
>> 
>> Is this a sign that the log rate is too high?
>> 
>> If I count the number of log messages per minute, they go from 70k (1x
>> load) to 140k (2x) to 210k (3x) very briefly.  That's over 3k messages per
>> second.  Then the log message volume drops way down to under 1k for a few
>> minutes before rising to a few thousand per minute.  There were only a few
>> hundred log messages per minute when the thread dump was taken.  CPU was
>> 100% at the start of 3x (no surprise) then dropped to around 50% even
>> though application throughput was almost 0.  Context switching was higher
>> than I've ever seen it (hundreds of thousands).  It's interesting that the
>> CPU usage was as high as it was even though application throughput was so
>> low.  Could there have been a large backlog of log messages that were
>> getting processed?
>> 
>> The application did recover when I dropped the load to 1x.  CPU usage and
>> log volume went back to normal.
>> 
>> Garbage collection logs reflect the fact that almost no work was being
>> done.
>> 
>> We're using
>> -Dlog4j2.contextSelector=org.apache.logging.log4j.core.async.AsyncLoggerContextSelector.
>> 
>> All of the log messages go to a single RollingFileAppender.  Disk space
>> was about 2/3 used.
>> 
>> Another common thing that I see in the thread dumps when that happens is a
>> thread like this blocking many others:
>> 
>> "http-bio-7114-exec-101" #2171 daemon prio=5 os_prio=0
>> tid=0x00007f2f5c083800 nid=0x1702 runnable [0x00007f2ea7e02000]
>>   java.lang.Thread.State: RUNNABLE
>>                at java.lang.Thread.setPriority0(Native Method)
>>                at java.lang.Thread.setPriority(Thread.java:1095)
>>                at java.lang.Thread.init(Thread.java:417)
>>                at java.lang.Thread.init(Thread.java:349)
>>                at java.lang.Thread.<init>(Thread.java:499)
>>              ...
>> 
>> We have a custom multi-threaded async appender that can handle this load.
>> 
>> Any suggestions?
>> 
>> 
>> 
>> Thanks
>> 
>> John
>> 
