Hi

I'm running Tomcat 5.5.25 on Debian Sarge, 32-bit Linux (2.6.8 kernel),
with ~1.5GB RAM on a Pentium 4 2GHz,
with a MySQL 5.0.27 database.

I've got a configuration with Apache mod_jk 1.2.25 balancing to 2
Tomcats, which are both running on JDK 1.6.0_16 with -Xmx256M.

Periodically, generally at busy times, the "JK Status Manager" shows the
busy count on one of the Tomcats going up, and the requests channelled
through that container start to hang. The busy count steadily increases
and throughput drops dramatically (i.e. the Acc column in jk_status stops
incrementing by ~30 every 10 seconds and drops to more like 4).

This continues until either I stop that Tomcat member through the
JK Status Manager (by editing the worker settings), or the busy count
climbs above the number of permitted Apache requests (150 at the
moment) and Apache is restarted automatically by an out-of-process
monitoring app.

If I stop the tomcat instance through the JK Status Manager, then the
busy count will gradually (over a period of 5 - 10 mins) decrease and
get to 0.

I took a thread dump by tee-ing the output of catalina.out and then
sending a "kill -SIGQUIT pid" while it was in the described busy state;
the dump contained a lot of threads.

The crux of it seemed to be a lot of threads blocked, waiting for a
lock/monitor held in HandlerRequest.checkRequest. Here is the printout
of the thread holding the lock, which is in the RUNNABLE state:



"TP-Processor65" daemon prio=10 tid=0x08bc9400 nid=0x54bd runnable [0x55dd0000]
   java.lang.Thread.State: RUNNABLE
   at java.lang.Class.getMethod(Class.java:1605)
   at 
org.apache.commons.modeler.BaseModelMBean.setManagedResource(BaseModelMBean.java:764)
   at org.apache.commons.modeler.ManagedBean.createMBean(ManagedBean.java:393)
   at org.apache.commons.modeler.Registry.registerComponent(Registry.java:835)
   at org.apache.jk.common.ChannelSocket.registerRequest(ChannelSocket.java:466)
   at org.apache.jk.common.HandlerRequest.checkRequest(HandlerRequest.java:357)
   - locked <0x4490ee38> (a java.lang.Object)
   at org.apache.jk.common.HandlerRequest.decodeRequest(HandlerRequest.java:367)
   at org.apache.jk.common.HandlerRequest.invoke(HandlerRequest.java:261)
   at org.apache.jk.common.ChannelSocket.invoke(ChannelSocket.java:773)
   at 
org.apache.jk.common.ChannelSocket.processConnection(ChannelSocket.java:703)
   at 
org.apache.jk.common.ChannelSocket$SocketConnection.runIt(ChannelSocket.java:895)
   at 
org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:689)
   at java.lang.Thread.run(Thread.java:619)

Then there are lots of threads of the following type (around 35 of them), all blocked:

"TP-Processor63" daemon prio=10 tid=0x09ddc800 nid=0x549f waiting for
monitor entry [0x55d30000]
   java.lang.Thread.State: BLOCKED (on object monitor)
   at org.apache.jk.common.HandlerRequest.checkRequest(HandlerRequest.java:357)
   - waiting to lock <0x4490ee38> (a java.lang.Object)
   at org.apache.jk.common.HandlerRequest.decodeRequest(HandlerRequest.java:367)
   at org.apache.jk.common.HandlerRequest.invoke(HandlerRequest.java:261)
   at org.apache.jk.common.ChannelSocket.invoke(ChannelSocket.java:773)
   at 
org.apache.jk.common.ChannelSocket.processConnection(ChannelSocket.java:703)
   at 
org.apache.jk.common.ChannelSocket$SocketConnection.runIt(ChannelSocket.java:895)
   at 
org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:689)
   at java.lang.Thread.run(Thread.java:619)

"TP-Processor62" daemon prio=10 tid=0x09dd4c00 nid=0x549e waiting for
monitor entry [0x55ce0000]
   java.lang.Thread.State: BLOCKED (on object monitor)
   at org.apache.jk.common.HandlerRequest.checkRequest(HandlerRequest.java:357)
   - waiting to lock <0x4490ee38> (a java.lang.Object)
   at org.apache.jk.common.HandlerRequest.decodeRequest(HandlerRequest.java:367)
   at org.apache.jk.common.HandlerRequest.invoke(HandlerRequest.java:261)
   at org.apache.jk.common.ChannelSocket.invoke(ChannelSocket.java:773)
   at 
org.apache.jk.common.ChannelSocket.processConnection(ChannelSocket.java:703)
   at 
org.apache.jk.common.ChannelSocket$SocketConnection.runIt(ChannelSocket.java:895)
   at 
org.apache.tomcat.util.threads.ThreadPool$ControlRunnable.run(ThreadPool.java:689)
   at java.lang.Thread.run(Thread.java:619)

.....etc....
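
Just to spell out the pattern I think these dumps show (this is only a toy
sketch, nothing to do with the real Tomcat/JK code): one thread holds a
monitor while it does something slow, and every other request-processing
thread queues up BLOCKED behind it:

// Toy illustration only, not the real HandlerRequest code: one thread does
// slow work inside a synchronized block while the others pile up BLOCKED on
// the same monitor -- the same shape as the dumps above.
public class MonitorContentionDemo {

    private static final Object LOCK = new Object();

    static void handleRequest() {
        synchronized (LOCK) {          // every worker funnels through one monitor
            slowSharedWork();          // while this is slow, everyone else blocks
        }
    }

    static void slowSharedWork() {
        try {
            Thread.sleep(2000);        // stand-in for whatever checkRequest is doing
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
    }

    public static void main(String[] args) {
        for (int i = 0; i < 10; i++) {
            new Thread(new Runnable() {
                public void run() {
                    handleRequest();
                }
            }, "TP-Processor-" + i).start();
        }
        // A "kill -QUIT <pid>" against this shows one thread holding LOCK and
        // the rest "waiting for monitor entry" on the same address.
    }
}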

Here is a typical reading from the JK Status Manager for the
balanced members:
Name    tomcatA
Type    ajp13
Host    localhost:8011
Addr    127.0.0.1:8011
Act     ACT
State   OK
D       0
F       100
M       1
V       104
Acc     42540
Err     0
CE      426
RE      0
Wr      34M
Rd      792M
Busy    80
Max     92
Route   tomcatA
RR      
Cd      
Cd      0/0

Name    tomcatB
Type    ajp13
Host    localhost:8012
Addr    127.0.0.1:8012
Act     ACT
State   OK
D       0
F       100
M       1
V       97
Acc     42719
Err     0
CE      377
RE      0
Wr      39M
Rd      807M
Busy    4
Max     57
Route   tomcatB
RR      
Cd      
Cd      0/0


I've been trying to diagnose the problem for about a week, running one
of the Tomcats with remote monitoring software (YourKit - it's
excellent, by the way). I've already pinpointed and removed some other
issues (locking issues with log4j logging - switched to an asynchronous
appender; locking issues with DBCP/pool - switched to c3p0; and a small
memory leak), so now the only problem is the one described above.
Incidentally, the Tomcat instance which is running YourKit is more
prone to locking at the moment.
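
For what it's worth, both of those changes were fairly mechanical; they
look roughly like this (all the concrete values below - driver, URL,
credentials, pool sizes, buffer size - are placeholders rather than my
real settings):

import java.beans.PropertyVetoException;

import org.apache.log4j.Appender;
import org.apache.log4j.AsyncAppender;
import org.apache.log4j.Logger;

import com.mchange.v2.c3p0.ComboPooledDataSource;

// Rough shape of the two changes; every concrete value is illustrative only.
public class TuningSketch {

    // Wrap an existing log4j appender in an AsyncAppender so logging calls
    // don't hold request threads up on disk I/O.
    public static void makeLoggingAsync(Appender existing) {
        AsyncAppender async = new AsyncAppender();
        async.setBufferSize(512);      // events buffered before the policy below applies
        async.setBlocking(false);      // discard rather than block when the buffer is full
        async.addAppender(existing);
        Logger root = Logger.getRootLogger();
        root.removeAppender(existing);
        root.addAppender(async);
    }

    // Plain c3p0 pool in place of commons-dbcp.
    public static ComboPooledDataSource createDataSource() throws PropertyVetoException {
        ComboPooledDataSource ds = new ComboPooledDataSource();
        ds.setDriverClass("com.mysql.jdbc.Driver");
        ds.setJdbcUrl("jdbc:mysql://localhost:3306/listings");
        ds.setUser("appuser");
        ds.setPassword("secret");
        ds.setMinPoolSize(5);
        ds.setMaxPoolSize(20);
        ds.setAcquireIncrement(2);
        ds.setCheckoutTimeout(5000);          // ms to wait for a free connection
        ds.setIdleConnectionTestPeriod(300);  // seconds between idle-connection tests
        return ds;
    }
}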

I feel it could be that, under certain circumstances or for certain
requests (I'm not sure which), checkRequest takes a *long* time to
return. The web app is a property multiple-listings site, so people are
uploading photos all the time, and I'm wondering if it's something to
do with that (I handle the uploads with the
org.apache.commons.fileupload classes).
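
The upload handling itself is just the standard commons-fileupload
pattern, roughly like this (the thresholds, limits and paths here are
placeholders, not my actual values):

import java.io.File;
import java.util.Iterator;
import java.util.List;

import javax.servlet.http.HttpServletRequest;

import org.apache.commons.fileupload.FileItem;
import org.apache.commons.fileupload.disk.DiskFileItemFactory;
import org.apache.commons.fileupload.servlet.ServletFileUpload;

public class PhotoUploadHandler {

    // Parse a multipart request and write each uploaded photo to disk.
    // Thresholds, limits and target directories are illustrative only.
    public void handleUpload(HttpServletRequest request) throws Exception {
        DiskFileItemFactory factory = new DiskFileItemFactory();
        factory.setSizeThreshold(256 * 1024);      // keep small files in memory
        factory.setRepository(new File("/tmp"));   // spill larger ones to disk

        ServletFileUpload upload = new ServletFileUpload(factory);
        upload.setSizeMax(10 * 1024 * 1024);       // reject requests over 10MB

        List items = upload.parseRequest(request); // FileUpload 1.2 returns a raw List
        for (Iterator it = items.iterator(); it.hasNext();) {
            FileItem item = (FileItem) it.next();
            if (!item.isFormField()) {
                File target = new File("/data/photos", item.getName());
                item.write(target);                // the app resizes images separately
            }
        }
    }
}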

I think the box is a bit overloaded: between them the instances are
handling around 220,000 requests a day on a busy day, and uploading and
resizing around 1,000 images.
Beyond that I can't see why it should be hanging. Recently we've
started to get around 10% more traffic, which has made the problem
worse.
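(For scale, 220,000 requests spread over 24 hours averages out to roughly
2.5 requests per second, so peak rates are presumably a few times that.)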

If anyone has any ideas, or needs any more information to help diagnose
the problem, please let me know.
Also, if anyone knows what the HandlerRequest.checkRequest method does
and why it is there, I'd be interested to hear.

Cheers
Simon
