----- Original Message -----
Sent: Thursday, February 15, 2001 3:43 PM
Subject: runaway threads eating cpu cycles on Solaris 7
We are running Tomcat 3.2.1 and Solaris 7 on a Sun e250 with four 400 MHz
processors. The problem we're having is that one thread is chewing up the
majority of the CPU cycles and sometimes causes Tomcat to hang.
I have included sample mpstat data and the output from ps -L -p PID:
ps -L -p 26361
  PID   LWP TTY     LTIME CMD
26361     1 ?        0:03 java
26361    22 ?        1:02 java
26361    23 ?       40:57 java
26361    24 ?        1:43 java
26361    26 ?        0:09 java
26361    67 ?        0:03 java
(24 entries deleted for brevity. All were at 0:00 LTIME)
mpstat 30
CPU minf mjf xcal  intr ithr  csw icsw migr smtx  srw syscl  usr sys  wt idl
  0   12   0   12     6    4   17    0    1    0    0    75    0   3   1  96
  1    6   0    6     4    1   14    3    0    0    0    52   59   1   1  39
  2    0   0    0    64   62   12    2    0    0    0    20   41   1   1  57
  3    4   0   14   203    3   27    0    1    0    0    26    0   0   0 100

CPU minf mjf xcal  intr ithr  csw icsw migr smtx  srw syscl  usr sys  wt idl
  0    0   0    1     3    2   16    0    1    1    0    51    1   3   1  95
  1    0   0    0     5    1    6    4    0    0    0     3   81   0   0  19
  2    0   0    2    19   19   17    0    1    0    0    16    0   0   0 100
  3    4   0   13   202    2   15    1    1    0    0    41   19   2   0  78

CPU minf mjf xcal  intr ithr  csw icsw migr smtx  srw syscl  usr sys  wt idl
  0    4   0    5     5    5   17    0    1    1    0    70    1   2   1  96
  1    4   0    0     5    1   11    4    0    0    0    36   84   0   0  15
  2    2   0    4    26   26   28    0    1    0    0    81    2   0   0  98
  3    0   0   20   204    4   19    1    0    0    0    24   14   0   0  86
Before today, this was happening about every 3 days. Today the occurrences
were only 5 hours apart. By going through our logs, we have determined that
this is not caused by any specific user action. It is also not caused by
server load, as it mostly happens with fewer than 5 users accessing the
application. The onset is not gradual, either: our sar statistics show
processor idle time at 98%, then 5 minutes later down to 83%, and 5 minutes
after that at 49%.
Is there any way that I can tell exactly what is happening in the offending
thread? Any other ideas on what's causing this problem?
Thanks,
Kelly