Took another look at the Disruptor SleepingWait strategy. It actually uses
a back-off strategy. From the javadoc:

Sleeping strategy that initially spins, then uses a Thread.yield(), and
eventually sleeps for the minimum number of nanos the OS and JVM will allow
while the {@link com.lmax.disruptor.EventProcessor}s are waiting on a
barrier. This strategy is a good compromise between performance and CPU
resource. Latency spikes can occur after quiet periods.

The Disruptor SleepingWaitStrategy code eventually calls
LockSupport.parkNanos(1L). Different platforms have different timer
resolutions (I think Windows cannot go smaller than ~15 millis), and it is
possible that AIX has a more accurate clock, in which case parkNanos(1L)
would return almost immediately and the loop would effectively busy-spin.
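
For context, the back-off loop looks roughly like this. This is a sketch
reconstructed from the javadoc and the stack trace, not the actual LMAX
source; the retry budget of 200 and the spin threshold of 100 are assumptions:

```java
import java.util.concurrent.locks.LockSupport;

// Sketch of the SleepingWaitStrategy back-off, reconstructed from the
// javadoc and the stack trace (not the actual LMAX source).
public class SleepingBackoffSketch {

    static final int RETRIES = 200; // assumed initial retry budget

    // One step of the back-off: spin, then yield, then park for 1 nanosecond.
    static int applyWaitMethod(final int counter) {
        if (counter > 100) {
            return counter - 1;        // phase 1: busy-spin
        } else if (counter > 0) {
            Thread.yield();            // phase 2: give up the time slice
            return counter - 1;
        } else {
            LockSupport.parkNanos(1L); // phase 3: shortest sleep the OS/JVM allows
            return 0;
        }
    }

    public static void main(String[] args) {
        int counter = RETRIES;
        // Simulate a consumer waiting on an empty ring buffer: once the
        // budget is exhausted, every iteration ends in parkNanos(1L).
        for (int i = 0; i < 300; i++) {
            counter = applyWaitMethod(counter);
        }
        System.out.println("counter after back-off: " + counter); // prints 0
    }
}
```

On a platform where parkNanos(1L) really does return after about a nanosecond,
phase 3 is indistinguishable from a busy-spin, which would match the ~50% idle
CPU seen on AIX.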

I'm beginning to think perhaps BlockingWait should be the default for log4j.
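In the meantime, users can opt in to the blocking strategy themselves.
Something like the following should work; the property names are from my
memory of the 2.0-rc1 system properties and the plain `java -jar` launch is
illustrative only (Archiva runs under WebSphere), so double-check both
against the log4j2 async logger docs:

```shell
# Select the blocking wait strategy for async loggers (property names and
# launch command are illustrative; verify against the log4j2 manual).
java \
  -DLog4jContextSelector=org.apache.logging.log4j.core.async.AsyncLoggerContextSelector \
  -DAsyncLogger.WaitStrategy=Block \
  -DAsyncLoggerConfig.WaitStrategy=Block \
  -jar myapp.jar
```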

On Fri, Mar 21, 2014 at 9:33 AM, Chris Graham <[email protected]> wrote:

> The AIX system clock does not have the same time base as most Intel boxes.
> What is the sleep time in the sleep strategy? If it's being derived, it
> might be too small. ???
>
> Just a thought.
>
> To further complicate matters, this particular LPAR was uncapped, which
> means that it can steal CPU from other LPARs that are not as busy, so the
> number of active CPUs can vary dynamically.
>
> -Chris
>
>
>
> On Fri, Mar 21, 2014 at 9:01 AM, Remko Popma <[email protected]>
> wrote:
>
> > No, it turned out that the docs for Apache Archiva were incorrect and the
> > WaitStrategy was effectively still SleepingWaitStrategy. Using the correct
> > config to specify BlockingWaitStrategy solved the issue. (LOG4J2-571)
> >
> > FYI, the wait strategy determines what the consumer thread does when the
> > queue is empty and it's waiting for events. Some specialized apps want to
> > avoid the latency of waking up a blocked thread, so there are a number of
> > options with different trade-offs, with busy-spin on one end of the
> > spectrum and blocking on the other. For log4j I reduced the set of
> > available options (no busy-spin), and chose SleepingWait as the default.
> > This had the best performance in my testing. Until now I hadn't seen any
> > excessive CPU usage.
> >
> > Sent from my iPhone
> >
> > > On 2014/03/20, at 22:10, Matt Sicker <[email protected]> wrote:
> > >
> > > Perhaps LMAX Disruptor doesn't work properly in the IBM JVM?
> > >
> > >> On Tuesday, 18 March 2014, Chris Graham <[email protected]> wrote:
> > >>
> > >> JStack is a Sun thing. This is the IBM JDK on AIX.
> > >> I've run the tprof command twice and verified it.
> > >>
> > >> The full walkthrough follows.
> > >>
> > >> The output from topas (same as top, effectively) is:
> > >>
> > >> Topas Monitor for host:    XXXXXXXXXXXXXXX      EVENTS/QUEUES    FILE/TTY
> > >> Wed Mar 19 14:49:55 2014   Interval:  2         Cswitch    3581  Readch      686
> > >>                                                 Syscall    2763  Writech    1378
> > >> CPU  User%  Kern%  Wait%  Idle%  Physc   Entc   Reads         7  Rawin         0
> > >> ALL   48.8    1.2    0.0   50.1   1.03   51.7   Writes        5  Ttyout      643
> > >>                                                 Forks         0  Igets         0
> > >> Network  KBPS   I-Pack  O-Pack   KB-In  KB-Out  Execs         0  Namei        81
> > >> Total     2.8     10.0     7.5     1.1     1.7  Runqueue    1.0  Dirblk        0
> > >>                                                 Waitqueue   0.0
> > >> Disk    Busy%     KBPS     TPS KB-Read KB-Writ                   MEMORY
> > >> Total     0.0      0.0     0.0     0.0     0.0  PAGING           Real,MB   20480
> > >>                                                 Faults        3  % Comp     44
> > >> FileSystem        KBPS     TPS KB-Read KB-Writ  Steals        0  % Noncomp  54
> > >> Total              0.6     1.5    0.6    0.0    PgspIn        0  % Client   54
> > >>                                                 PgspOut       0
> > >> Name            PID  CPU%  PgSp Owner           PageIn        0  PAGING SPACE
> > >> java        9437312  48.6 739.1 wasadmin        PageOut       0  Size,MB   15552
> > >>
> > >> The process with a PID of 9437312 is the WebSphere instance that runs
> > >> Archiva, the problem in question.
> > >>
> > >> We then use a native AIX tool, tprof, to examine that process and see
> > >> what it's doing:
> > >>
> > >> tprof -j -P 9437312 -skex sleep 60
> > >>
> > >> This generates sleep.prof, and the relevant section is:
> > >>
> > >> Configuration information
> > >> =========================
> > >> System: AIX 6.1 Node: au02qap207teax2 Machine: 00C43D204C00
> > >> Tprof command was:
> > >>    tprof -j -P 9437312 -skex sleep 60
> > >> Trace command was:
> > >>    /usr/bin/trace -ad -M -L 2438933299 -T 500000 -j
> > >> 00A,001,002,003,38F,005,006,134,210,139,5A2,5A5,465,234,5D8, -o -
> > >> Total Samples = 24008
> > >> Traced Time = 60.02s (out of a total execution time of 60.02s)
> > >>
> > >>
> > >> <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<
> > >> Process                          Freq  Total Kernel   User Shared  Other   Java
> > >> =======                          ====  ===== ======   ==== ======  =====   ====
> > >> wait                                4  64.29  64.29   0.00   0.00   0.00   0.00
> > >> java                              182  34.66   1.97   0.00  32.65   0.05   0.00
> > >> /usr/sbin/syncd                     2   0.17   0.17   0.00   0.00   0.00   0.00
> > >> sshd:                               1   0.14   0.09   0.05   0.01   0.00   0.00
> > >> /usr/bin/ps                        13   0.12   0.10   0.00   0.03   0.00   0.00
> > >> /usr/bin/sh                        27   0.12   0.11   0.00   0.00   0.00   0.00
> > >> httpd                              12   0.09   0.05   0.01   0.03   0.00   0.00
> > >> /usr/bin/scp                        1   0.09   0.07   0.00   0.01   0.00   0.00
> > >> /opt/freeware/bin/readlink          6   0.04   0.04   0.00   0.00   0.00   0.00
> > >> /usr/bin/tprof                      1   0.03   0.00   0.01   0.02   0.00   0.00
> > >> kulagent                            7   0.03   0.02   0.00   0.00   0.00   0.00
> > >> swapper                             1   0.02   0.02   0.00   0.00   0.00   0.00
> > >> /usr/bin/grep                       5   0.02   0.02   0.00   0.00   0.00   0.00
> > >> /usr/sbin/getty                     1   0.02   0.02   0.00   0.00   0.00   0.00
> > >> kuxagent                            3   0.02   0.01   0.00   0.00   0.00   0.00
> > >> lrud                                1   0.02   0.02   0.00   0.00   0.00   0.00
> > >> srmAIX                              3   0.01   0.01   0.00   0.00   0.00   0.00
> > >> /opt/tivoli/cit/bin/wscanuse        1   0.01   0.01   0.00   0.00   0.00   0.00
> > >> psmd                                1   0.01   0.01   0.00   0.00   0.00   0.00
> > >> xmgc                                1   0.01   0.01   0.00   0.00   0.00   0.00
> > >> nfs_stat                            2   0.01   0.01   0.00   0.00   0.00   0.00
> > >> /usr/bin/topasrec                   2   0.01   0.01   0.00   0.00   0.00   0.00
> > >> /usr/bin/xmtopas                    1   0.01   0.00   0.00   0.00   0.00   0.00
> > >> kcawd                               1   0.00   0.00   0.00   0.00   0.00   0.00
> > >> /usr/sbin/snmpd                     1   0.00   0.00   0.00   0.00   0.00   0.00
> > >> /usr/sbin/snmpmibd                  1   0.00   0.00   0.00   0.00   0.00   0.00
> > >> /usr/bin/date                       1   0.00   0.00   0.00   0.00   0.00   0.00
> > >> /usr/bin/entstat                    1   0.00   0.00   0.00   0.00   0.00   0.00
> > >> gil                                 1   0.00   0.00   0.00   0.00   0.00   0.00
> > >> /usr/bin/vmstat                     1   0.00   0.00   0.00   0.00   0.00   0.00
> > >> /usr/bin/ln                         1   0.00   0.00   0.00   0.00   0.00   0.00
> > >> stat_daemon                         1   0.00   0.00   0.00   0.00   0.00   0.00
> > >> j2pg                                1   0.00   0.00   0.00   0.00   0.00   0.00
> > >> /usr/sbin/netstat                   1   0.00   0.00   0.00   0.00   0.00   0.00
> > >> =======                          ====  ===== ======   ==== ======  =====   ====
> > >> Total                             289 100.00  67.11   0.07  32.77   0.05   0.00
> > >>
> > >> Process               PID      TID  Total Kernel   User Shared  Other   Java
> > >> =======               ===      ===  ===== ======   ==== ======  =====   ====
> > >> java              9437312 46072057  23.76   0.05   0.00  23.70   0.00   0.00
> > >>
> > >> Which gives me the needed TID, 46072057 = 0x2BF00F9.
> > >>
> > >> So to trigger a non-fatal javacore dump:
> > >>
> > >> kill -3 9437312
> > >>
> > >> and then we look into the created javacore.20140318.173250.9437312.0002.txt
> > >> file for references to 0x2BF00F9:
> > >>
> > >> 3XMTHREADINFO      "AsyncLoggerConfig-1" J9VMThread:0x0000000031D14600, j9thread_t:0x00000100137D8BD0, java/lang/Thread:0x000000004301C508, state:CW, prio=5
> > >> 3XMJAVALTHREAD            (java/lang/Thread getId:0x6A, isDaemon:true)
> > >> 3XMTHREADINFO1            (native thread ID:0x2BF00F9, native priority:0x5, native policy:UNKNOWN)
> > >> 3XMHEAPALLOC             Heap bytes allocated since last GC cycle=0 (0x0)
> > >> 3XMTHREADINFO3           Java callstack:
> > >> 4XESTACKTRACE                at sun/misc/Unsafe.park(Native Method)
> > >> 4XESTACKTRACE                at java/util/concurrent/locks/LockSupport.parkNanos(LockSupport.java:332)
> > >> 4XESTACKTRACE                at com/lmax/disruptor/SleepingWaitStrategy.applyWaitMethod(SleepingWaitStrategy.java:66)
> > >> 4XESTACKTRACE                at com/lmax/disruptor/SleepingWaitStrategy.waitFor(SleepingWaitStrategy.java:39)
> > >> 4XESTACKTRACE                at com/lmax/disruptor/ProcessingSequenceBarrier.waitFor(ProcessingSequenceBarrier.java:55)
> > >> 4XESTACKTRACE                at com/lmax/disruptor/BatchEventProcessor.run(BatchEventProcessor.java:115)
> > >> 4XESTACKTRACE                at java/util/concurrent/ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895)
> > >> 4XESTACKTRACE                at java/util/concurrent/ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918)
> > >> 4XESTACKTRACE                at java/lang/Thread.run(Thread.java:773)
> > >> 3XMTHREADINFO3           Native callstack:
> > >> 4XENATIVESTACK               _event_wait+0x2b8 (0x09000000007E7D3C [libpthreads.a+0x16d3c])
> > >> 4XENATIVESTACK               _cond_wait_local+0x4e4 (0x09000000007F5A48 [libpthreads.a+0x24a48])
> > >> 4XENATIVESTACK               _cond_wait+0xbc (0x09000000007F6020 [libpthreads.a+0x25020])
> > >> 4XENATIVESTACK               pthread_cond_wait+0x1a8 (0x09000000007F6C8C [libpthreads.a+0x25c8c])
> > >> 4XENATIVESTACK               (0x0900000001223014 [libj9thr26.so+0x6014])
> > >> 4XENATIVESTACK               (0x0900000001222C60 [libj9thr26.so+0x5c60])
> > >> 4XENATIVESTACK               (0x090000000116AE58 [libj9vm26.so+0xfe58])
> > >> 4XENATIVESTACK               (0x090000000116B17C [libj9vm26.so+0x1017c])
> > >> 4XENATIVESTACK               (0x0900000001810528 [libjclscar_26.so+0x5c528])
> > >> 4XENATIVESTACK               (0x0900000001813B98 [libjclscar_26.so+0x5fb98])
> > >> 4XENATIVESTACK               (0x0900000001161764 [libj9vm26.so+0x6764])
> > >> 4XENATIVESTACK               (0x0900000001239CA0 [libj9prt26.so+0x2ca0])
> > >> 4XENATIVESTACK               (0x09000000011615D4 [libj9vm26.so+0x65d4])
> > >> 4XENATIVESTACK               (0x090000000121FAF4 [libj9thr26.so+0x2af4])
> > >> 4XENATIVESTACK               _pthread_body+0xf0 (0x09000000007D4D34 [libpthreads.a+0x3d34])
> > >> NULL
> > >>
> > >> Would you like me to attach the complete sleep.tprof and javacore.txt
> > >> files to the Jira ticket that I just created?
> > >>
> > >> https://issues.apache.org/jira/browse/LOG4J2-571
> > >>
> > >> Thanks for looking, this has been driving me nuts.
> > >>
> > >> -Chris
> > >>
> > >>
> > >>
> > >> On Wed, Mar 19, 2014 at 2:45 PM, Remko Popma <[email protected]>
> > >> wrote:
> > >>
> > >>> Just to double-check: when using blocking wait, and running jstack, does
> > >>> BlockingWaitStrategy appear in the stack trace?
> > >>> Also, is it possible to double-check (perhaps attach VisualVM) that it
> > >>> definitely is the AsyncLoggerConfig-1 thread that consumes so much CPU?
> > >>>
> > >>>
> > >>> Sent from my iPhone
> > >>>
> > >>>> On 2014/03/19, at 12:31, Chris Graham <[email protected]> wrote:
> > >>>>
> > >>>> I have tried both Block and Sleep (the default), but not Yield. No
> > >>>> discernible difference.
> > >>>>
> > >>>>
> > >>>>> On Wed, Mar 19, 2014 at 1:53 PM, Remko Popma <
> [email protected]>
> > >>> wrote:
> > >>>>>
> > >>>>> As tweeted, I suggest trying the blocking wait strategy. Can you run
> > >>>>> a jstack dump (and perhaps attach the result to a Jira ticket)? In
> > >>>>> the attached stack trace below, the AsyncLoggerConfig-1 thread seems
> > >>>>> to be parked, waiting for a new log event... which doesn't explain
> > >>>>> the high CPU usage...
> > >>>>>
> > >>>>> Sent from my iPhone
> > >>>>>
> > >>>>>> On 2014/03/19, at 10:27, Chris Graham <[email protected]>
> wrote:
> > >>>>>>
> > >>>>>> Hello Everyone.
> > >>>>>>
> > >>>>>> In this instance, I'm an indirect user of log4j2 2.0-rc1, as it's in
> > >>>>>> the web app that I'm using, Apache Archiva 2.0.1.
> > >>>>>>
> > >>>>>> The issue is that when running under WebSphere 8.5.0.2 (obviously on
> > >>>>>> the IBM JDK, 1.6) on AIX 6.1 TL8, Apache Archiva, when it's doing
> > >>>>>> nothing, is sitting idle at around 50% CPU.
> > >>>>>>
> > >>>>>> Obviously, this is not good!
> > >>>>>>
> > >>>>>> I've performed the AIX native analysis to get the native thread ID,
> > >>>>>> mapped it to a Java thread ID, triggered a javacore dump, and I've
> > >>>>>> found this as the culprit:
> > >>>>>>
> > >>>>>> 3XMTHREADINFO      "AsyncLoggerConfig-1" J9VMThread:0x0000000031D14600, j9thread_t:0x00000100137D8BD0, java/lang/Thread:0x000000004301C508, state:CW, prio=5
> > >>>>>> 3XMJAVALTHREAD            (java/lang/Thread getId:0x6A, isDaemon:true)
> > >>>>>> 3XMTHREADINFO1            (native thread ID:0x2BF00F9, native priority:0x5, native policy:UNKNOWN)
> > >>>>>> 3XMHEAPALLOC             Heap bytes allocated since last GC cycle=0 (0x0)
> > >>>>>> 3XMTHREADINFO3           Java callstack:
> > >>>>>> 4XESTACKTRACE                at sun/misc/Unsafe.park(Native Method)
> > >>>>>> 4XESTACKTRACE                at java/util/concurrent/locks/LockSupport.parkNanos(LockSupport.java:332)
> > >>>>>> 4XESTACKTRACE                at com/lmax/disruptor/SleepingWaitStrategy.applyWaitMethod(SleepingWaitStrategy.java:66)
> > >>>>>> 4XESTACKTRACE
> > >
> > >
> > >
> > > --
> > > Matt Sicker <[email protected]>
> >
> > ---------------------------------------------------------------------
> > To unsubscribe, e-mail: [email protected]
> > For additional commands, e-mail: [email protected]
> >
> >
>
