The first stack trace indicates that the application is producing log events faster than they can be consumed, so let's take a look at the components that sit downstream from the ring buffer.
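For reference, the knobs that govern that ring buffer in 2.11.x are system properties rather than anything in the XML configuration (they can also go in a log4j2.component.properties file on the classpath). Below is a minimal sketch using the 2.10+ property names as I recall them from the manual; the values are illustrative only, not a recommendation for your load:

    # All-async loggers (you already set this one):
    log4j2.contextSelector=org.apache.logging.log4j.core.async.AsyncLoggerContextSelector

    # Number of slots in the ring buffer the producer threads are blocking on
    # (rounded up to a power of two; 262144 is the documented default, I believe):
    log4j2.asyncLoggerRingBufferSize=262144

    # What producers do when the ring buffer is full: the default policy blocks
    # (the parkNanos calls in your dump); "Discard" instead drops events at or
    # below the discard threshold.
    log4j2.asyncQueueFullPolicy=Discard
    log4j2.discardThreshold=INFO

Discarding is of course a trade-off; I only mention it because producers blocking in handleRingBufferFull is exactly what your thread dump shows.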
You mentioned you’re using a RollingFileAppender. That is usually not a bottleneck, although an aggressive rollover policy can make it one. How often do you roll over? (RollingRandomAccessFileAppender is a bit faster, but not by much; see https://logging.apache.org/log4j/2.x/performance.html#tradeoffs.)

Another potential source of slowness is logging with location (https://logging.apache.org/log4j/2.x/performance.html#asyncLoggingWithLocation). Are you using layout options like HTML locationInfo, or one of the patterns %C or %class, %F or %file, %l or %location, %L or %line, %M or %method? (A minimal config sketch covering both the appender and the layout points is appended at the end of this message.) Also, do you have any custom components like Filters or Lookups configured?

Finally, I don’t know yet what to think of that second stack trace with setPriority... It could be significant, but it could also be a red herring. Let’s look further into that after excluding the usual suspects I mentioned above.

Remko

(Shameless plug) Every java main() method deserves http://picocli.info

> On Mar 28, 2019, at 0:50, Anthony Maire <[email protected]> wrote:
> 
> Hi John,
> 
> If you have this stack trace in the threads that are producing messages,
> then your disruptor is full because the appender is not capable of writing
> fast enough.
> Since you have a single appender, it might be useful to switch to
> RollingRandomAccessFileAppender, which can batch writes to disk to reach
> greater throughput if you are limited by I/O operations.
> In your case I would probably run a profiling session with Java Mission
> Control or any other sampling profiler you are used to, to see whether
> the slowness in the single appender thread is caused by I/O operations
> or by the processing/formatting of the message itself.
> 
> Best regards,
> Anthony
> 
> On Wed, Mar 27, 2019 at 16:23, <[email protected]> wrote:
> 
>> Hi all,
>> 
>> I'm using Log4j2 2.11.2 and am evaluating switching from custom async
>> appenders to async loggers.
>> 
>> I've been running some scalability tests with async loggers and sometimes
>> get good results but mostly get terrible results at high loads.
>> At high loads, I typically see many threads in this state:
>> 
>> http-bio-7114-exec-406 - priority:5 - threadId:0x00007f2f2c0cc000 - nativeId:0x4b3e - nativeId (decimal):19262 - state:TIMED_WAITING
>> stackTrace:
>> java.lang.Thread.State: TIMED_WAITING (parking)
>> at sun.misc.Unsafe.park(Native Method)
>> at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:338)
>> at com.lmax.disruptor.MultiProducerSequencer.next(MultiProducerSequencer.java:136)
>> at com.lmax.disruptor.MultiProducerSequencer.next(MultiProducerSequencer.java:105)
>> at com.lmax.disruptor.RingBuffer.publishEvent(RingBuffer.java:465)
>> at com.lmax.disruptor.dsl.Disruptor.publishEvent(Disruptor.java:331)
>> at org.apache.logging.log4j.core.async.AsyncLoggerDisruptor.enqueueLogMessageInfo(AsyncLoggerDisruptor.java:222)
>> at org.apache.logging.log4j.core.async.AsyncLogger.handleRingBufferFull(AsyncLogger.java:181)
>> at org.apache.logging.log4j.core.async.AsyncLogger.publish(AsyncLogger.java:166)
>> at org.apache.logging.log4j.core.async.AsyncLogger.logWithThreadLocalTranslator(AsyncLogger.java:161)
>> at org.apache.logging.log4j.core.async.AsyncLogger.logMessage(AsyncLogger.java:131)
>> at org.apache.logging.log4j.spi.ExtendedLoggerWrapper.logMessage(ExtendedLoggerWrapper.java:217)
>> at org.apache.logging.log4j.spi.AbstractLogger.tryLogMessage(AbstractLogger.java:2170)
>> at org.apache.logging.log4j.spi.AbstractLogger.logMessageTrackRecursion(AbstractLogger.java:2125)
>> at org.apache.logging.log4j.spi.AbstractLogger.logMessageSafely(AbstractLogger.java:2108)
>> at org.apache.logging.log4j.spi.AbstractLogger.logMessage(AbstractLogger.java:2002)
>> at org.apache.logging.log4j.spi.AbstractLogger.logIfEnabled(AbstractLogger.java:1974)
>> 
>> Is this a sign that the log rate is too high?
>> 
>> If I count the number of log messages per minute, they go from 70k (1x load) to 140k (2x)
>> to 210k (3x) very briefly. That's over 3k messages per second. Then the log message volume
>> drops way down to under 1k for a few minutes before rising to a few thousand per minute.
>> There were only a few hundred log messages per minute when the thread dump was taken.
>> CPU was 100% at the start of 3x (no surprise), then dropped to around 50% even though
>> application throughput was almost 0. Context switching was higher than I've ever seen it
>> (hundreds of thousands). It's interesting that the CPU usage was as high as it was even
>> though application throughput was so low. Could there have been a large backlog of log
>> messages that were getting processed?
>> 
>> The application did recover when I dropped the load to 1x. CPU usage and log volume
>> went back to normal.
>> 
>> Garbage collection logs reflect the fact that almost no work was being done.
>> 
>> We're using -Dlog4j2.contextSelector=org.apache.logging.log4j.core.async.AsyncLoggerContextSelector.
>> 
>> All of the log messages go to a single RollingFileAppender. Disk space was about 2/3 used.
>> 
>> Another common thing that I see in the thread dumps when that happens is a thread like
>> this blocking many others:
>> 
>> "http-bio-7114-exec-101" #2171 daemon prio=5 os_prio=0 tid=0x00007f2f5c083800 nid=0x1702 runnable [0x00007f2ea7e02000]
>> java.lang.Thread.State: RUNNABLE
>> at java.lang.Thread.setPriority0(Native Method)
>> at java.lang.Thread.setPriority(Thread.java:1095)
>> at java.lang.Thread.init(Thread.java:417)
>> at java.lang.Thread.init(Thread.java:349)
>> at java.lang.Thread.<init>(Thread.java:499)
>> ...
>> 
>> We have a custom multi-threaded async appender that can handle this load.
>> 
>> Any suggestions?
>> 
>> Thanks
>> 
>> John
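P.S. To make the appender and layout points above concrete, here is a minimal sketch of the kind of configuration I have in mind: a RollingRandomAccessFileAppender with a PatternLayout that contains none of the location converters. The appender name, file paths, sizes and levels are placeholders I made up, not recommendations, and I have not tested this exact snippet against your setup:

    <?xml version="1.0" encoding="UTF-8"?>
    <Configuration status="warn">
      <Appenders>
        <!-- RollingRandomAccessFile buffers writes; immediateFlush=false is safe with
             async loggers because the queue is flushed at the end of each batch. -->
        <RollingRandomAccessFile name="File"
                                 fileName="logs/app.log"
                                 filePattern="logs/app-%d{yyyy-MM-dd}-%i.log.gz"
                                 immediateFlush="false">
          <!-- No %C/%class, %F/%file, %l/%location, %L/%line, %M/%method:
               capturing location is expensive with async loggers. -->
          <PatternLayout pattern="%d{ISO8601} [%t] %-5level %logger{36} - %msg%n"/>
          <Policies>
            <TimeBasedTriggeringPolicy/>
            <SizeBasedTriggeringPolicy size="250 MB"/>
          </Policies>
          <DefaultRolloverStrategy max="10"/>
        </RollingRandomAccessFile>
      </Appenders>
      <Loggers>
        <!-- includeLocation already defaults to false for async loggers;
             it is spelled out here only to make the point explicit. -->
        <Root level="info" includeLocation="false">
          <AppenderRef ref="File"/>
        </Root>
      </Loggers>
    </Configuration>

The JVM would still be started with -Dlog4j2.contextSelector=org.apache.logging.log4j.core.async.AsyncLoggerContextSelector, as you already do.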
