Thanks Anthony,

I tried the RollingRandomAccessFileAppender but got the same results.

I think the issue is not I/O but rather the masking of sensitive data that we
do.  The actual masking is done in an implementation of AbstractStringLayout,
so it all happens on the single appender thread, Log4j2-TF-1-AsyncLogger.
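For reference, the layout looks roughly like this (a simplified sketch; the
class name and the regex are placeholders, not our real masking rules):

import java.nio.charset.StandardCharsets;
import java.util.regex.Pattern;

import org.apache.logging.log4j.core.LogEvent;
import org.apache.logging.log4j.core.config.plugins.Plugin;
import org.apache.logging.log4j.core.config.plugins.PluginFactory;
import org.apache.logging.log4j.core.layout.AbstractStringLayout;

// Simplified stand-in for our masking layout.  All of the regex work in
// toSerializable() runs on the single Log4j2-TF-1-AsyncLogger thread.
@Plugin(name = "MaskingLayout", category = "Core", elementType = "layout", printObject = true)
public final class MaskingLayout extends AbstractStringLayout {

    // Placeholder pattern; the real rules are more numerous and more expensive.
    private static final Pattern CARD = Pattern.compile("\\b\\d{13,16}\\b");

    private MaskingLayout() {
        super(StandardCharsets.UTF_8);
    }

    @PluginFactory
    public static MaskingLayout createLayout() {
        return new MaskingLayout();
    }

    @Override
    public String toSerializable(final LogEvent event) {
        final String formatted = event.getMessage().getFormattedMessage();
        // The expensive part: masking sensitive data before it hits the file.
        return CARD.matcher(formatted).replaceAll("****") + System.lineSeparator();
    }
}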

Is there an OOTB way to use more threads for that purpose?  Previously we used
our own async appender implementation that ran multiple threads, with good
results.

Alternatively, is there a point where we can hook in some custom code to do 
that masking on the producer threads?
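To make the idea concrete, something along these lines (just a sketch, not code
we have today; the wrapper class and the pattern are made up, only the Log4j2
API calls are real):

import java.util.regex.Pattern;

import org.apache.logging.log4j.LogManager;
import org.apache.logging.log4j.Logger;

// Sketch: do the masking on the calling (producer) thread, before the event
// is published to the disruptor, so the appender thread only formats and writes.
public final class MaskingLogger {

    // Placeholder rule standing in for our real masking logic.
    private static final Pattern CARD = Pattern.compile("\\b\\d{13,16}\\b");

    private final Logger delegate;

    public MaskingLogger(final Class<?> clazz) {
        this.delegate = LogManager.getLogger(clazz);
    }

    public void info(final String message) {
        if (delegate.isInfoEnabled()) {
            // Runs on the producer thread; the layout then has nothing to mask.
            delegate.info(CARD.matcher(message).replaceAll("****"));
        }
    }
}

The downside is that we would have to route all logging through a wrapper like
this, which is why an extension point inside Log4j2 itself would be preferable.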

Thanks


> -----Original Message-----
> From: Anthony Maire <maire.anth...@gmail.com>
> Sent: Wednesday, March 27, 2019 10:50 AM
> To: Log4J Users List <log4j-user@logging.apache.org>
> Subject: Re: Many threads "stuck" when using async loggers
> 
> Hi John
> 
> If you have this stack trace in the threads that are producing messages, then
> your disruptor is full because the appender is not capable of writing fast
> enough.
> Since you have a single appender, it might be useful to switch to
> RollingRandomAccessFileAppender, which can batch writes to disk to reach
> greater throughput if you are limited by I/O operations.  In your case I would
> probably run a profiling session with Java Mission Control or any other
> sampling profiler that you are used to, to see if the slowness in the single
> appender thread is caused by I/O operations or by the processing/formatting of
> the message itself.
> 
> Best regards,
> Anthony
> 
> On Wed, 27 Mar 2019 at 16:23, <john.e.gr...@wellsfargo.com.invalid> wrote:
> 
> > Hi all,
> >
> > I'm using Log4j2 2.11.2 and am evaluating switching from custom async
> > appenders to async loggers.
> >
> > I've been running some scalability tests with async loggers and
> > sometimes get good results but mostly get terrible results at high
> > loads.  At high loads, I typically see many threads in this state:
> >
> > http-bio-7114-exec-406 - priority:5 - threadId:0x00007f2f2c0cc000 -
> > nativeId:0x4b3e - nativeId (decimal):19262 - state:TIMED_WAITING
> > stackTrace:
> > java.lang.Thread.State: TIMED_WAITING (parking)
> >   at sun.misc.Unsafe.park(Native Method)
> >   at java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:338)
> >   at com.lmax.disruptor.MultiProducerSequencer.next(MultiProducerSequencer.java:136)
> >   at com.lmax.disruptor.MultiProducerSequencer.next(MultiProducerSequencer.java:105)
> >   at com.lmax.disruptor.RingBuffer.publishEvent(RingBuffer.java:465)
> >   at com.lmax.disruptor.dsl.Disruptor.publishEvent(Disruptor.java:331)
> >   at org.apache.logging.log4j.core.async.AsyncLoggerDisruptor.enqueueLogMessageInfo(AsyncLoggerDisruptor.java:222)
> >   at org.apache.logging.log4j.core.async.AsyncLogger.handleRingBufferFull(AsyncLogger.java:181)
> >   at org.apache.logging.log4j.core.async.AsyncLogger.publish(AsyncLogger.java:166)
> >   at org.apache.logging.log4j.core.async.AsyncLogger.logWithThreadLocalTranslator(AsyncLogger.java:161)
> >   at org.apache.logging.log4j.core.async.AsyncLogger.logMessage(AsyncLogger.java:131)
> >   at org.apache.logging.log4j.spi.ExtendedLoggerWrapper.logMessage(ExtendedLoggerWrapper.java:217)
> >   at org.apache.logging.log4j.spi.AbstractLogger.tryLogMessage(AbstractLogger.java:2170)
> >   at org.apache.logging.log4j.spi.AbstractLogger.logMessageTrackRecursion(AbstractLogger.java:2125)
> >   at org.apache.logging.log4j.spi.AbstractLogger.logMessageSafely(AbstractLogger.java:2108)
> >   at org.apache.logging.log4j.spi.AbstractLogger.logMessage(AbstractLogger.java:2002)
> >   at org.apache.logging.log4j.spi.AbstractLogger.logIfEnabled(AbstractLogger.java:1974)
> >
> > Is this a sign that the log rate is too high?
> >
> > If I count the number of log messages per minute, they go from 70k (1x
> > load) to 140k (2x) to 210k (3x) very briefly.  That's over 3k messages
> > per second.  Then the log message volume drops way down to under 1k
> > for a few minutes before rising to a few thousand per minute.  There
> > were only a few hundred log messages per minute when the thread dump
> > was taken.  CPU was 100% at the start of 3x (no surprise) then dropped
> > to around 50% even though application throughput was almost 0.
> > Context switching was higher than I've ever seen it (hundreds of
> > thousands).  It's interesting that the CPU usage was as high as it was
> > even though application throughput was so low.  Could there have been
> > a large backlog of log messages that were getting processed?
> >
> > The application did recover when I dropped the load to 1x.  CPU usage
> > and log volume went back to normal.
> >
> > Garbage collection logs reflect the fact that almost no work was being
> > done.
> >
> > We're using
> > -Dlog4j2.contextSelector=org.apache.logging.log4j.core.async.AsyncLoggerContextSelector.
> >
> > All of the log messages go to a single RollingFileAppender.  Disk
> > space was about 2/3 used.
> >
> > Another common thing that I see in the thread dumps when that happens
> > is a thread like this blocking many others:
> >
> > "http-bio-7114-exec-101" #2171 daemon prio=5 os_prio=0
> > tid=0x00007f2f5c083800 nid=0x1702 runnable [0x00007f2ea7e02000]
> >    java.lang.Thread.State: RUNNABLE
> >                 at java.lang.Thread.setPriority0(Native Method)
> >                 at java.lang.Thread.setPriority(Thread.java:1095)
> >                 at java.lang.Thread.init(Thread.java:417)
> >                 at java.lang.Thread.init(Thread.java:349)
> >                 at java.lang.Thread.<init>(Thread.java:499)
> >               ...
> >
> > We have a custom multi-threaded async appender that can handle this load.
> >
> > Any suggestions?
> >
> >
> >
> > Thanks
> >
> > John
> >
