Re: [dtrace-discuss] debugging a faulty jboss/application

Adam Leventhal Wed, 26 Nov 2008 14:30:35 -0800

Is java running with the -server option on x86? There's a known bug
where jstack() can't capture stacks.

On Nov 24, 2008, at 11:57 AM, Fernando Gleiser wrote:


>
> From time to time a jboss process would end up eating all the
> available CPU, and the load avg would skyrocket.
>
> Once the operators restarted jboss, the system'd be normal again
> (sometimes for weeks) until the next incident.
>
> Since we moved the app from a v440 running Solaris 10 8/07 to a
> t2000 running Solaris 10 5/08, the problem has started to happen more
> frequently (2-3 times a week). The only other modification (besides
> the hardware and OS release) is that we put the app server inside a
> sparse-zone container with the FSS and a 95% limit on the CPU usage.
>
> Here's what uptime says:
>
> Mon Nov 24 10:10:00 ARST 2008
> 10:10am  up 10 day(s), 14:40,  6 users,  load average: 325.52, 320.07, 318.72
>
> (yes, load avg is almost 320, but the server was still usable)
>
> a little output from mpstat shows this:
>
> CPU minf mjf xcal  intr ithr  csw icsw migr smtx  srw syscl  usr sys  wt idl
>  0    0   0  531   259  150   46   27   17   43    0    29   99   1   0   0
>  1    0   0    3    66    0  124   65   36   10    0    86   99   1   0   0
>  2    0   0    0    23    0   34   22   17    9    0     8  100   0   0   0
>  3    0   0    0    13    0   13   12    6    9    0     1  100   0   0   0
>  4    0   0    0    13    0   14   12    9    9    0     2  100   0   0   0
>  5    0   0    0    38    0   69   37   17    8    0    23  100   0   0   0
>  6    0   0    0    25    0   40   24   13    8    0     7   95   5   0   0
>  7    0   0    0    15    0   17   14   10    9    0     3  100   0   0   0
>  8    0   0    0    11    0   10   10    6    8    0     0  100   0   0   0
>  9    0   0    0    12    0   11   11    7    9    0     0  100   0   0   0
> 10    0   0    0    14    0   13   13    7    9    0     0  100   0   0   0
> 11    0   0    0    13    0   12   12    6    9    0     0  100   0   0   0
> 12    0   0    0    25    0   37   24   14    8    0    25  100   0   0   0
> 13    0   0    0    13    0   13   12    9    9    0     1  100   0   0   0
> 14    0   0    0    13    0   12   12    4    9    0     0  100   0   0   0
> 15    0   0    0    13    0   12   12    7    9    0     0  100   0   0   0
> 16    0   0    0    12    0   11   11    5    7    0     0  100   0   0   0
> 17    0   0    0    14    0   13   13    8    7    0     0  100   0   0   0
> 18    0   0    0    13    0   12   12    8    9    0     0  100   0   0   0
> 19    0   0    1    13    0   12   12    7    8    0     0  100   0   0   0
> 20    0   0    7    25    0   35   25   13    9    0     3  100   0   0   0
> 21    0   0   72    45    3   78   41   29   11    0    19   99   1   0   0
> 22    0   0    3    44    4   71   39   21   11    0   135  100   0   0   0
> 23    0   0    3    20    5   16   14    7    9    0     2  100   0   0   0
> 24    0   0    3    17    5   13   12    6    9    0     2   99   1   0   0
> 25    0   0    1    15    4   10   10    8    8    0     0  100   0   0   0
> 26    0   0    8    15    0   17   14    5   10    0     0  100   0   0   0
> 27    0   0    0    12    0   11   11    7    8    0     0  100   0   0   0
> 28    0   0    1    11    0   10   10    6    8    0     0  100   0   0   0
> 29    0   0    2    67    0  126   66   30   11    0    48  100   0   0   0
> 30    0   0    0    19    0   25   18   11    8    0   232  100   0   0   0
> 31    0   0    1    27    0   47   26   12   10    0    10  100   0   0   0
>
>
>
> So, 100% cpu in usr mode looks a lot like some kind of infinite loop.
>
> I tried to dig it up using dtrace:
>
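> profile-1001us
> /(execname == "java") && (pid == $1)/
> {
>         /* count each distinct user stack seen at sample time */
>         @[jstack()] = count();
> }
>
> tick-20s
> {
>         /* stop after 20 seconds; the aggregation is printed on exit */
>         exit(0);
> }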
>
>
> But I got a LOT of these errors:
>
> dtrace: error on enabled probe ID 1 (ID 52070: profile:::profile-1001us): invalid address (0x96a7e000) in action #2
>
> and the stack traces are mostly the hex addresses without the names:
>
>
>              0xf886ca64
>              0xf884cec0
>              0xf8805d3c
>              0xf8805874
>              0xf8805c70
>              0xf8805c70
>              0xf8805d3c
>              0xf88380b8
>              0xf9509440
>              0xf88b7b44
>              0xf8805874
>              0xf8805874
>              0xf8805764
>              0xf8805874
>              0xf8805874
>              0xf8805d3c
>              0xf8805874
>              0xf8805874
>              0xf8805874
>              0xf8805874
>              0xf8805874
>              0xf8805874
>              0xf8805764
>              0xf8805764
>              0xf8805764
>              0xf886993c
>              0xf9999004
>              0xf884cec0
>              0xf8805874
>              0xf8805d3c
>              0xf884cc7c
>              0xf994090c
>              0xf8838df8
             0xf8800218
             libjvm.so`__1cJJavaCallsLcall_helper6FpnJJavaValue_pnMmethodHandle_pnRJavaCallArguments_pnGThread__v_+0x5a0
             libjvm.so`JVM_DoPrivileged+0x500
             libjava.so`Java_java_security_AccessController_doPrivileged__Ljava_security_PrivilegedExceptionAction_2Ljava_security_AccessControlContext_2+0x14
>              0xf880e22c
>              0xf880e1d0
>              0xf89c6470
>              0xf9dfc018
>              0xf99ee350
>              0xf9d74fd8
>              0xf8838df8
             0xf8800218
             libjvm.so`__1cJJavaCallsLcall_helper6FpnJJavaValue_pnMmethodHandle_pnRJavaCallArguments_pnGThread__v_+0x5a0
             libjvm.so`JVM_DoPrivileged+0x500
             libjava.so`Java_java_security_AccessController_doPrivileged__Ljava_security_PrivilegedExceptionAction_2+0x14
>              0xf9467cc8
>              0xf9badd44
>             9852
>
>
> So my questions are:
>
> Any ideas on what to do next? I'm pretty sure it's a jboss/application
> problem, but I need to get more data to show the jboss/devel people.
> What's causing those dtrace errors?
>
>
> thanks in advance
>
>
>
> _______________________________________________
> dtrace-discuss mailing list
> dtrace-discuss@opensolaris.org
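
As one possible next step, here is a minimal sketch (not a tested fix)
of a variant of the script quoted above. It counts samples per thread
as well as per stack, and passes explicit frame and string-size limits
to jstack(). The 997 Hz rate, the 50-frame and 2048-byte limits, and
the top-10 truncation are all assumptions chosen for illustration.
Whether any of this changes the "invalid address" errors depends on
their cause, which the sketch does not diagnose.

```d
#!/usr/sbin/dtrace -s
/*
 * Sketch only. Assumptions: $1 is the pid of the JVM, sampling at
 * 997 Hz, 50 frames and 2048 bytes of string space per jstack().
 */
#pragma D option quiet

profile-997
/pid == $1/
{
        /* which threads (LWPs) are burning CPU */
        @bytid[tid] = count();

        /* Java-aware user stacks with explicit frame/string limits */
        @stacks[jstack(50, 2048)] = count();
}

tick-20s
{
        /* keep only the hottest entries; they are printed on exit */
        trunc(@bytid, 10);
        trunc(@stacks, 10);
        exit(0);
}
```

It would be run as "dtrace -s hot.d <pid>", where hot.d is a
hypothetical file name. Even if the Java frames stay unresolved, the
per-thread counts may help match the busy threads against a JVM thread
dump to show the jboss people.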


--
Adam Leventhal, Fishworks                        http://blogs.sun.com/ahl
