Yes, looks the same. In our case it's also under JRE 11, with G1 and string deduplication enabled (I doubt it matters :) but I'll mention it), and running on virtual machines.
On Mon, Mar 15, 2021 at 8:50 AM Carter Kozak <cko...@ckozak.net> wrote:
>
> This sounds a lot like an issue I discussed with the lmax-disruptor folks
> here: https://github.com/LMAX-Exchange/disruptor/issues/307
> I ended up switching my services to log synchronously rather than enqueue
> when the buffer is full, which can produce events out of order; however, we
> log into a structured pipeline that indexes based on event timestamp. The
> funny thing is that I haven't seen any occurrences of this behavior since
> switching the queue-full policy to synchronous rather than enqueue, so I'm
> inclined to believe there's a disruptor bug somewhere, but I could be wrong.
>
> On Sun, Mar 14, 2021, at 20:50, Leon Finker wrote:
> > Yes, unfortunately, all logging stopped and nothing was seen in the log
> > file for about 2 hrs until the service was restarted. But there is nothing
> > pointing to a deadlock in the thread dumps we got so far :( However, the
> > thread dump captured was part of our standard kill -3 if the process
> > isn't exiting normally, so it may not show the correct picture during
> > shutdown.
> >
> > On Sun, Mar 14, 2021 at 3:32 AM Ralph Goers <ralph.go...@dslextreme.com> wrote:
> > >
> > > One other thing. My expectation is that once the buffer is full you
> > > should continue to log, but at the rate your underlying system can
> > > handle. This wouldn't be a deadlock, but if the rate at which you are
> > > logging is higher than the system can write, the result will be that
> > > the application slows down.
> > >
> > > If you are truly seeing a deadlock then it is possible there is a
> > > problem in the disruptor, but I have very little experience with that
> > > and would need Remko to weigh in.
> > >
> > > Ralph
> > >
> > > > On Mar 14, 2021, at 12:26 AM, Ralph Goers <ralph.go...@dslextreme.com> wrote:
> > > >
> > > > Yes, it rings bells and really should have a slot on our FAQ page
> > > > (although the question really isn't asked all that frequently).
> > > >
> > > > First, it means that you are logging faster than your system can
> > > > handle. The proper response is to figure out why you are logging so
> > > > much and try to reduce it, or to use an alternative destination that
> > > > is faster. For example, if you are logging to a file you may need to
> > > > improve the hardware or increase the buffer size.
> > > >
> > > > As you observed, your buffer has filled up. The default behavior is
> > > > for threads to wait until there is an empty slot.
> > > >
> > > > Remko is the real expert on this, but I will take a stab at answering
> > > > your question. The threads below show that they are in
> > > > handleRingBufferFull. That method chooses among three options based
> > > > on how you have configured the asyncQueueFullPolicy. You basically
> > > > have 3 choices:
> > > > 1. Wait for an open slot (the default).
> > > > 2. Resort to logging synchronously.
> > > > 3. Discard log events.
> > > >
> > > > You should look at
> > > > https://logging.apache.org/log4j/2.x/log4j-core/apidocs/org/apache/logging/log4j/core/async/AsyncQueueFullPolicy.html.
> > > > Note that option 2 above (synchronous) is only used by Log4j in
> > > > certain circumstances and there isn't an option to specify it, so the
> > > > only real options are the default behavior, discarding, or creating a
> > > > custom AsyncQueueFullPolicy.
> > > >
> > > > Ralph
> > > >
> > > >
> > > >> On Mar 13, 2021, at 10:32 PM, Leon Finker <leon...@gmail.com> wrote:
> > > >>
> > > >> Hi,
> > > >>
> > > >> Using log4j2 2.13.1 with disruptor 3.4.2. Linux CentOS.
> > >> JRE 11.0.5.
> > >> Using the following JVM args:
> > >> -DAsyncLogger.RingBufferSize=32768
> > >> -DLog4jContextSelector=org.apache.logging.log4j.core.async.AsyncLoggerContextSelector
> > >> -Dlog4j.configurationFile=...
> > >>
> > >> The disruptor queue has filled up, and we've observed a deadlock with
> > >> the interesting threads stuck as shown below. The disruptor processing
> > >> thread seems to be waiting for events to process, and all the other
> > >> threads are locked up trying to publish events.
> > >>
> > >> "Log4j2-TF-1-AsyncLogger[AsyncContext@5cb0d902]-1" #23 daemon prio=5
> > >> os_prio=0 cpu=40099128.73ms elapsed=44102.95s tid=0x00007f4c2ca34000
> > >> nid=0x611a runnable [0x00007f4bfdcfc000]
> > >>    java.lang.Thread.State: RUNNABLE
> > >>      at com.lmax.disruptor.ProcessingSequenceBarrier.waitFor(ProcessingSequenceBarrier.java:56)
> > >>      at com.lmax.disruptor.BatchEventProcessor.processEvents(BatchEventProcessor.java:159)
> > >>      at com.lmax.disruptor.BatchEventProcessor.run(BatchEventProcessor.java:125)
> > >>      at java.lang.Thread.run(java.base@11.0.5/Thread.java:834)
> > >>
> > >>    Locked ownable synchronizers:
> > >>      - None
> > >>
> > >> "xxx" #268 prio=5 os_prio=0 cpu=7051742.32ms elapsed=43967.04s
> > >> tid=0x00007f4c2e6ea800 nid=0x649d runnable [0x00007f49f77dd000]
> > >>    java.lang.Thread.State: TIMED_WAITING (parking)
> > >>      at jdk.internal.misc.Unsafe.park(java.base@11.0.5/Native Method)
> > >>      at java.util.concurrent.locks.LockSupport.parkNanos(java.base@11.0.5/LockSupport.java:357)
> > >>      at com.lmax.disruptor.MultiProducerSequencer.next(MultiProducerSequencer.java:136)
> > >>      at com.lmax.disruptor.MultiProducerSequencer.next(MultiProducerSequencer.java:105)
> > >>      at com.lmax.disruptor.RingBuffer.publishEvent(RingBuffer.java:465)
> > >>      at com.lmax.disruptor.dsl.Disruptor.publishEvent(Disruptor.java:331)
> > >>      at org.apache.logging.log4j.core.async.AsyncLoggerDisruptor.enqueueLogMessageWhenQueueFull(AsyncLoggerDisruptor.java:230)
> > >>      - locked <0x0000000701017288> (a java.lang.Object)
> > >>      at org.apache.logging.log4j.core.async.AsyncLogger.handleRingBufferFull(AsyncLogger.java:246)
> > >>      at org.apache.logging.log4j.core.async.AsyncLogger.publish(AsyncLogger.java:230)
> > >>      at org.apache.logging.log4j.core.async.AsyncLogger.logWithThreadLocalTranslator(AsyncLogger.java:225)
> > >>      at org.apache.logging.log4j.core.async.AsyncLogger.access$000(AsyncLogger.java:67)
> > >>      at org.apache.logging.log4j.core.async.AsyncLogger$1.log(AsyncLogger.java:152)
> > >>      at org.apache.logging.log4j.core.async.AsyncLogger.log(AsyncLogger.java:136)
> > >>      at org.apache.logging.log4j.spi.AbstractLogger.tryLogMessage(AbstractLogger.java:2198)
> > >>      at org.apache.logging.log4j.spi.AbstractLogger.logMessageTrackRecursion(AbstractLogger.java:2152)
> > >>      at org.apache.logging.log4j.spi.AbstractLogger.logMessageSafely(AbstractLogger.java:2135)
> > >>      at org.apache.logging.log4j.spi.AbstractLogger.logMessage(AbstractLogger.java:2016)
> > >>      at org.apache.logging.log4j.spi.AbstractLogger.logIfEnabled(AbstractLogger.java:1875)
> > >>      at org.apache.logging.slf4j.Log4jLogger.info(Log4jLogger.java:179)
> > >>      ...
> > >>      at java.lang.Thread.run(java.base@11.0.5/Thread.java:834)
> > >>
> > >>    Locked ownable synchronizers:
> > >>      - None
> > >>
> > >> And all other threads are stuck like this:
> > >>
> > >> "xx-nioW-16" #273 prio=5 os_prio=0 cpu=7439.15ms elapsed=43966.67s
> > >> tid=0x00007f4c2e4c7000 nid=0x64a9 waiting for monitor entry
> > >> [0x00007f49f6ad2000]
> > >>    java.lang.Thread.State: BLOCKED (on object monitor)
> > >>      at org.apache.logging.log4j.core.async.AsyncLoggerDisruptor.enqueueLogMessageWhenQueueFull(AsyncLoggerDisruptor.java:230)
> > >>      - waiting to lock <0x0000000701017288> (a java.lang.Object)
> > >>      at org.apache.logging.log4j.core.async.AsyncLogger.handleRingBufferFull(AsyncLogger.java:246)
> > >>      at org.apache.logging.log4j.core.async.AsyncLogger.publish(AsyncLogger.java:230)
> > >>      at org.apache.logging.log4j.core.async.AsyncLogger.logWithThreadLocalTranslator(AsyncLogger.java:225)
> > >>      at org.apache.logging.log4j.core.async.AsyncLogger.access$000(AsyncLogger.java:67)
> > >>      at org.apache.logging.log4j.core.async.AsyncLogger$1.log(AsyncLogger.java:152)
> > >>      at org.apache.logging.log4j.core.async.AsyncLogger.log(AsyncLogger.java:136)
> > >>      at org.apache.logging.log4j.spi.AbstractLogger.tryLogMessage(AbstractLogger.java:2198)
> > >>      at org.apache.logging.log4j.spi.AbstractLogger.logMessageTrackRecursion(AbstractLogger.java:2152)
> > >>      at org.apache.logging.log4j.spi.AbstractLogger.logMessageSafely(AbstractLogger.java:2135)
> > >>      at org.apache.logging.log4j.spi.AbstractLogger.logMessage(AbstractLogger.java:2022)
> > >>      at org.apache.logging.log4j.spi.AbstractLogger.logIfEnabled(AbstractLogger.java:1891)
> > >>      at org.apache.logging.slf4j.Log4jLogger.info(Log4jLogger.java:194)
> > >>      ...
> > >>
> > >> Does this ring any bells?
> > >>
> > >> Thank you in advance!
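The routing decision Ralph describes (wait, log synchronously, or discard) is made per event by an AsyncQueueFullPolicy. The real interface lives in `org.apache.logging.log4j.core.async` (AsyncQueueFullPolicy returning an EventRoute); the sketch below uses self-contained stand-in types that mirror its shape, purely for illustration, with level thresholds following log4j's convention that a smaller int means a more severe level (ERROR=200, WARN=300, INFO=400, DEBUG=500):

```java
// Illustrative sketch only: stand-in types mirroring the shape of
// log4j-core's AsyncQueueFullPolicy / EventRoute, not the real classes.

enum EventRoute { ENQUEUE, SYNCHRONOUS, DISCARD }

class SketchQueueFullPolicy {
    // isBackgroundThread: true when the caller IS the disruptor's own
    // consumer thread; it must never wait for ring-buffer space, because
    // it is the only thread that can free that space (self-deadlock).
    // levelIntValue: log4j level int (smaller = more severe).
    static EventRoute getRoute(boolean isBackgroundThread, int levelIntValue) {
        if (isBackgroundThread) {
            return EventRoute.DISCARD;
        }
        // Application threads: INFO (400) and more severe bypass the full
        // queue and log synchronously; DEBUG/TRACE events are dropped.
        return levelIntValue <= 400 ? EventRoute.SYNCHRONOUS
                                    : EventRoute.DISCARD;
    }
}
```

A real implementation would implement the log4j-core interface and be selected through a system property (in recent 2.x releases this is, if I recall correctly, `log4j2.asyncQueueFullPolicy`, which also accepts the built-in value `Discard` together with `log4j2.discardThreshold`); check the AsyncQueueFullPolicy javadoc linked above for the exact contract in your version.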
---------------------------------------------------------------------
To unsubscribe, e-mail: log4j-user-unsubscr...@logging.apache.org
For additional commands, e-mail: log4j-user-h...@logging.apache.org