The AIX system clock is not the same base time as most Intel boxes. What is the sleep time in the sleep strategy? If it's being derived, it might be too small. ???
Just a thought. To further complicate matters, this particular lpar was uncapped, which means that it can steal CPU from other lpars that are not as busy. So the number of active CPU's can dynamically vary. -Chris On Fri, Mar 21, 2014 at 9:01 AM, Remko Popma <[email protected]> wrote: > No, it turned out that the docs for Apache Archiva were incorrect and the > WaitStrategy was effectively still SleepingWaitStrategy. Using the correct > config to specify BlockingWaitStrategy solved the issue. (LOG4J2-571) > > FYI, the wait strategy determines what the consumer thread does when the > queue is empty & it's waiting for events. Some specialized apps want to > avoid the latency of waking up a blocked thread, so there are a number of > options with different trade-offs, with busy spin on one end of the > spectrum and blocking on the other. For log4j I reduced the set of > available options (no busy spin), and choose SleepingWait as the default. > This had the best performance in my testing. Until now I hadn't seen any > excessive CPU usage. > > Sent from my iPhone > > > On 2014/03/20, at 22:10, Matt Sicker <[email protected]> wrote: > > > > Perhaps lmax disruptor doesn't work properly in the IBM JVM? > > > >> On Tuesday, 18 March 2014, Chris Graham <[email protected]> wrote: > >> > >> JStack is a Sun thing. This is the IBM JDK on AIX. > >> I've run the tprof command twice and verified it. > >> > >> The full work though follows. > >> > >> The output from topas (same as top, effectively) is: > >> > >> Topas Monitor for host: XXXXXXXXXXXXXXX EVENTS/QUEUES > FILE/TTY > >> Wed Mar 19 14:49:55 2014 Interval: 2 Cswitch 3581 > >> Readch 686 > >> Syscall 2763 Writech > >> 1378 > >> CPU User% Kern% Wait% Idle% Physc Entc Reads 7 > >> Rawin 0 > >> ALL 48.8 1.2 0.0 50.1 1.03 51.7 Writes 5 > >> Ttyout 643 > >> Forks 0 > >> Igets 0 > >> Network KBPS I-Pack O-Pack KB-In KB-Out Execs 0 > >> Namei 81 > >> Total 2.8 10.0 7.5 1.1 1.7 Runqueue 1.0 > >> Dirblk 0 > >> Waitqueue 0.0 > >> Disk Busy% KBPS TPS KB-Read KB-Writ MEMORY > >> Total 0.0 0.0 0.0 0.0 0.0 PAGING Real,MB > >> 20480 > >> Faults 3 % Comp > >> 44 > >> FileSystem KBPS TPS KB-Read KB-Writ Steals 0 % > Noncomp > >> 54 > >> Total 0.6 1.5 0.6 0.0 PgspIn 0 % > Client > >> 54 > >> PgspOut 0 > >> Name PID CPU% PgSp Owner PageIn 0 PAGING > >> SPACE > >> java 9437312 48.6 739.1 wasadmin PageOut 0 Size,MB > >> 15552 > >> > >> Process with a PID of 9437312 is the WebSphere instance that runs > Archiva, > >> the problem in question. > >> > >> We then use a native AIX tool, tprof to examine that process, and see > what > >> it's doing: > >> > >> tprof -j -P 9437312 -skex sleep 60 > >> > >> This generates sleep.prof, and the relevant section is: > >> > >> Configuration information > >> ========================= > >> System: AIX 6.1 Node: au02qap207teax2 Machine: 00C43D204C00 > >> Tprof command was: > >> tprof -j -P 9437312 -skex sleep 60 > >> Trace command was: > >> /usr/bin/trace -ad -M -L 2438933299 -T 500000 -j > >> 00A,001,002,003,38F,005,006,134,210,139,5A2,5A5,465,234,5D8, -o - > >> Total Samples = 24008 > >> Traced Time = 60.02s (out of a total execution time of 60.02s) > >> > >> > <<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<<< > >> > >> Process Freq Total Kernel User Shared Other > >> Java > >> ======= ==== ===== ====== ==== ====== ===== > >> ==== > >> wait 4 64.29 64.29 0.00 0.00 0.00 > >> 0.00 > >> java 182 34.66 1.97 0.00 32.65 0.05 > >> 0.00 > >> /usr/sbin/syncd 2 0.17 0.17 0.00 0.00 0.00 > >> 0.00 > >> sshd: 1 0.14 0.09 0.05 0.01 0.00 > >> 0.00 > >> /usr/bin/ps 13 0.12 0.10 0.00 0.03 0.00 > >> 0.00 > >> /usr/bin/sh 27 0.12 0.11 0.00 0.00 0.00 > >> 0.00 > >> httpd 12 0.09 0.05 0.01 0.03 0.00 > >> 0.00 > >> /usr/bin/scp 1 0.09 0.07 0.00 0.01 0.00 > >> 0.00 > >> /opt/freeware/bin/readlink 6 0.04 0.04 0.00 0.00 0.00 > >> 0.00 > >> /usr/bin/tprof 1 0.03 0.00 0.01 0.02 0.00 > >> 0.00 > >> kulagent 7 0.03 0.02 0.00 0.00 0.00 > >> 0.00 > >> swapper 1 0.02 0.02 0.00 0.00 0.00 > >> 0.00 > >> /usr/bin/grep 5 0.02 0.02 0.00 0.00 0.00 > >> 0.00 > >> /usr/sbin/getty 1 0.02 0.02 0.00 0.00 0.00 > >> 0.00 > >> kuxagent 3 0.02 0.01 0.00 0.00 0.00 > >> 0.00 > >> lrud 1 0.02 0.02 0.00 0.00 0.00 > >> 0.00 > >> srmAIX 3 0.01 0.01 0.00 0.00 0.00 > >> 0.00 > >> /opt/tivoli/cit/bin/wscanuse 1 0.01 0.01 0.00 0.00 0.00 > >> 0.00 > >> psmd 1 0.01 0.01 0.00 0.00 0.00 > >> 0.00 > >> xmgc 1 0.01 0.01 0.00 0.00 0.00 > >> 0.00 > >> nfs_stat 2 0.01 0.01 0.00 0.00 0.00 > >> 0.00 > >> /usr/bin/topasrec 2 0.01 0.01 0.00 0.00 0.00 > >> 0.00 > >> /usr/bin/xmtopas 1 0.01 0.00 0.00 0.00 0.00 > >> 0.00 > >> kcawd 1 0.00 0.00 0.00 0.00 0.00 > >> 0.00 > >> /usr/sbin/snmpd 1 0.00 0.00 0.00 0.00 0.00 > >> 0.00 > >> /usr/sbin/snmpmibd 1 0.00 0.00 0.00 0.00 0.00 > >> 0.00 > >> /usr/bin/date 1 0.00 0.00 0.00 0.00 0.00 > >> 0.00 > >> /usr/bin/entstat 1 0.00 0.00 0.00 0.00 0.00 > >> 0.00 > >> gil 1 0.00 0.00 0.00 0.00 0.00 > >> 0.00 > >> /usr/bin/vmstat 1 0.00 0.00 0.00 0.00 0.00 > >> 0.00 > >> /usr/bin/ln 1 0.00 0.00 0.00 0.00 0.00 > >> 0.00 > >> stat_daemon 1 0.00 0.00 0.00 0.00 0.00 > >> 0.00 > >> j2pg 1 0.00 0.00 0.00 0.00 0.00 > >> 0.00 > >> /usr/sbin/netstat 1 0.00 0.00 0.00 0.00 0.00 > >> 0.00 > >> ======= ==== ===== ====== ==== ====== ===== > >> ==== > >> Total 289 100.00 67.11 0.07 32.77 0.05 > >> 0.00 > >> > >> Process PID TID Total Kernel User Shared Other > >> Java > >> ======= === === ===== ====== ==== ====== ===== > >> ==== > >> java 9437312 46072057 23.76 0.05 0.00 23.70 0.00 > >> 0.00 > >> > >> Which gives me the needed TID, 46072057 = 0x2BF00F9. > >> > >> So to trigger a non-fatal heap dump: > >> > >> kill -3 9437312 > >> > >> and then we look into the created > javacore.20140318.173250.9437312.0002.txt > >> file for references to 0x2BF00F9: > >> > >> 3XMTHREADINFO "AsyncLoggerConfig-1" J9VMThread:0x0000000031D14600, > >> j9thread_t:0x00000100137D8BD0, java/lang/Thread:0x000000004301C508, > >> state:CW, prio=5 > >> 3XMJAVALTHREAD (java/lang/Thread getId:0x6A, isDaemon:true) > >> 3XMTHREADINFO1 (native thread ID:0x2BF00F9, native > priority:0x5, > >> native policy:UNKNOWN) > >> 3XMHEAPALLOC Heap bytes allocated since last GC cycle=0 > (0x0) > >> 3XMTHREADINFO3 Java callstack: > >> 4XESTACKTRACE at sun/misc/Unsafe.park(Native Method) > >> 4XESTACKTRACE at > >> java/util/concurrent/locks/LockSupport.parkNanos(LockSupport.java:332) > >> 4XESTACKTRACE at > >> > >> > com/lmax/disruptor/SleepingWaitStrategy.applyWaitMethod(SleepingWaitStrategy.java:66) > >> 4XESTACKTRACE at > >> > >> > com/lmax/disruptor/SleepingWaitStrategy.waitFor(SleepingWaitStrategy.java:39) > >> 4XESTACKTRACE at > >> > >> > com/lmax/disruptor/ProcessingSequenceBarrier.waitFor(ProcessingSequenceBarrier.java:55) > >> 4XESTACKTRACE at > >> com/lmax/disruptor/BatchEventProcessor.run(BatchEventProcessor.java:115) > >> 4XESTACKTRACE at > >> > >> > java/util/concurrent/ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:895) > >> 4XESTACKTRACE at > >> > >> > java/util/concurrent/ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:918) > >> 4XESTACKTRACE at java/lang/Thread.run(Thread.java:773) > >> 3XMTHREADINFO3 Native callstack: > >> 4XENATIVESTACK _event_wait+0x2b8 (0x09000000007E7D3C > >> [libpthreads.a+0x16d3c]) > >> 4XENATIVESTACK _cond_wait_local+0x4e4 (0x09000000007F5A48 > >> [libpthreads.a+0x24a48]) > >> 4XENATIVESTACK _cond_wait+0xbc (0x09000000007F6020 > >> [libpthreads.a+0x25020]) > >> 4XENATIVESTACK pthread_cond_wait+0x1a8 (0x09000000007F6C8C > >> [libpthreads.a+0x25c8c]) > >> 4XENATIVESTACK (0x0900000001223014 [libj9thr26.so+0x6014]) > >> 4XENATIVESTACK (0x0900000001222C60 [libj9thr26.so+0x5c60]) > >> 4XENATIVESTACK (0x090000000116AE58 [libj9vm26.so+0xfe58]) > >> 4XENATIVESTACK (0x090000000116B17C [libj9vm26.so+0x1017c]) > >> 4XENATIVESTACK (0x0900000001810528 > >> [libjclscar_26.so+0x5c528]) > >> 4XENATIVESTACK (0x0900000001813B98 > >> [libjclscar_26.so+0x5fb98]) > >> 4XENATIVESTACK (0x0900000001161764 [libj9vm26.so+0x6764]) > >> 4XENATIVESTACK (0x0900000001239CA0 [libj9prt26.so+0x2ca0]) > >> 4XENATIVESTACK (0x09000000011615D4 [libj9vm26.so+0x65d4]) > >> 4XENATIVESTACK (0x090000000121FAF4 [libj9thr26.so+0x2af4]) > >> 4XENATIVESTACK _pthread_body+0xf0 (0x09000000007D4D34 > >> [libpthreads.a+0x3d34]) > >> NULL > >> > >> Would you like me to attach the complete sleep.tprof and javacore.txt > file > >> to the Jira ticket that I just created? > >> > >> https://issues.apache.org/jira/browse/LOG4J2-571 > >> > >> Thanks for looking, this has been driving me nuts. > >> > >> -Chris > >> > >> > >> > >> On Wed, Mar 19, 2014 at 2:45 PM, Remko Popma <[email protected]> > >> wrote: > >> > >>> Just to double-check: when using blocking wait, and running jstack, > does > >>> BlockingWaitStrategy appear in the stack trace? > >>> Also, it it possible to double-check (perhaps attach VisualVM) that it > >>> definitely is the AsyncLoggerConfig-1 thread that consumes so much CPU? > >>> > >>> > >>> Sent from my iPhone > >>> > >>>> On 2014/03/19, at 12:31, Chris Graham <[email protected]> wrote: > >>>> > >>>> I have tried both Block and Sleep (the default), but not Yield. No > >>>> discernable difference. > >>>> > >>>> > >>>>> On Wed, Mar 19, 2014 at 1:53 PM, Remko Popma <[email protected]> > >>> wrote: > >>>>> > >>>>> As tweeted, I suggest trying the blocking wait strategy. Can you run > a > >>>>> jstack dump (and perhaps attach result to a Jira ticket)? In the > >>> attached > >>>>> stack trace below, the AsyncLoggerConfig-1 thread seems to be parked, > >>>>> waiting for a new log event... Doesn't explain high CPU usage... > >>>>> > >>>>> Sent from my iPhone > >>>>> > >>>>>> On 2014/03/19, at 10:27, Chris Graham <[email protected]> wrote: > >>>>>> > >>>>>> Hello Everyone. > >>>>>> > >>>>>> In this instance, I'm in indirect used of log4j2 2.0-rc1, as it's in > >>> the > >>>>>> web app that I'm using, Apache Archiva 2.0.1. > >>>>>> > >>>>>> The issue is that when running under WebSphere 8.5.0.2 (obviously on > >>> the > >>>>>> IBM JDK, 1.6) on AIX 6.1 TL8, Apache Archiva when it's doing > nothing, > >>> is > >>>>>> sitting idle on around 50% CPU. > >>>>>> > >>>>>> Obviosuly, this is not good! > >>>>>> > >>>>>> I've performed the AIX native analysis, to get the native thread ID, > >>>>> mapped > >>>>>> it to a Java thread it, triggered a heap dump, and I've found this > as > >>> the > >>>>>> culprit: > >>>>>> > >>>>>> 3XMTHREADINFO "AsyncLoggerConfig-1" > >> J9VMThread:0x0000000031D14600, > >>>>>> j9thread_t:0x00000100137D8BD0, java/lang/Thread:0x000000004301C508, > >>>>>> state:CW, prio=5 > >>>>>> 3XMJAVALTHREAD (java/lang/Thread getId:0x6A, > >> isDaemon:true) > >>>>>> 3XMTHREADINFO1 (native thread ID:0x2BF00F9, native > >>>>> priority:0x5, > >>>>>> native policy:UNKNOWN) > >>>>>> 3XMHEAPALLOC Heap bytes allocated since last GC cycle=0 > >>> (0x0) > >>>>>> 3XMTHREADINFO3 Java callstack: > >>>>>> 4XESTACKTRACE at sun/misc/Unsafe.park(Native Method) > >>>>>> 4XESTACKTRACE at > >> java/util/concurrent/locks/LockSupport.parkNanos(LockSupport.java:332) > >>>>>> 4XESTACKTRACE at > >> > com/lmax/disruptor/SleepingWaitStrategy.applyWaitMethod(SleepingWaitStrategy.java:66) > >>>>>> 4XESTACKTRACE > > > > > > > > -- > > Matt Sicker <[email protected]> > > --------------------------------------------------------------------- > To unsubscribe, e-mail: [email protected] > For additional commands, e-mail: [email protected] > >
