I've got a very large web application (about 300 objects and about 1000
pages) which uses mostly straight JSP.  It gets a reasonable number of
hits, with approximately 200 concurrent sessions active.

Recently, we introduced something that's causing what looks like a
thread deadlock.  Some unknown event occurs, then things start grinding to
a halt as threads get backed up.  When this happens the only way to get out
is to hard kill the server (Orion's shutdown doesn't work, and kill -TERM
doesn't work either).

This only really occurs under load, and we cannot reproduce it in a
development environment (even with load-testing tools).  We've crawled
through every line of code carefully and have found some obscure race
conditions we hadn't considered (ones we have never actually seen occur).
But so far nothing we've found fixes our real problem, so I'm fairly
convinced that I'm not going to find it easily by looking at Java code.

Now I've tried jdb, and of course I can only see suspended threads (which is
not too useful), and I've tried JProbe, but that only shows the parent
thread's state.  I even tried strace/truss, but that's too low-level to make
out what's happening.  I'm starting to use 'kill -3', but that again only
shows the parent thread.

Does anyone have any suggestions on doing runtime debugging at the thread
level?  I'd really just like to see what's actually happening in the locked
threads.
Anyone?
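
To make concrete what I'm after: on a JVM that ships java.lang.management
(an assumption on my part; it needs Java 5 or later and may not apply to the
JVM we run Orion on), something like the sketch below would give the
per-thread view I mean.  The class name ThreadDumper is made up for
illustration, and this is only a rough idea of the output I want, not
something we have running.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

// Hypothetical helper: prints every live thread's state, the lock it is
// waiting on (if any), who holds that lock, and its stack trace.
public class ThreadDumper {

    public static String dump() {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        StringBuilder out = new StringBuilder();

        // Returns null when no threads are deadlocked on object monitors.
        long[] deadlocked = mx.findMonitorDeadlockedThreads();
        out.append("deadlocked threads: ")
           .append(deadlocked == null ? 0 : deadlocked.length)
           .append('\n');

        // false/false: skip locked-monitor and synchronizer details, which
        // not every JVM supports; lock name and owner are still reported.
        for (ThreadInfo info : mx.dumpAllThreads(false, false)) {
            out.append('"').append(info.getThreadName()).append("\" ")
               .append(info.getThreadState());
            if (info.getLockName() != null) {
                out.append(" waiting on ").append(info.getLockName());
                if (info.getLockOwnerName() != null) {
                    out.append(" held by \"")
                       .append(info.getLockOwnerName()).append('"');
                }
            }
            out.append('\n');
            for (StackTraceElement frame : info.getStackTrace()) {
                out.append("    at ").append(frame).append('\n');
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(dump());
    }
}
```

If something like this could be called from inside the running server (say,
from a JSP page that only we can reach), the "waiting on ... held by ..."
lines are exactly what I'd want to look at once things start backing up.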

 
