I am having a problem with high CPU utilization by my Tomcat (java)
process.
First, some background:
Sun V880
Solaris 9
java.runtime.name = Java(TM) 2 Runtime Environment, Standard Edition
java.vm.version = 1.4.1_06-b01
Tomcat 4.1.27
MySQL 3.23.53
Homegrown object cache (eliminates trips to database)
DBCP
We are running the 64 bit extensions for java.
Application is a viewer of reports and is used to upload new reports.
Report sizes are anywhere from 2k to 800 MB. Users can only view at most
1500 pages (~10MB) at any one time.
400 total users, 50 fairly active users, maybe 4 concurrent at most.
Client in almost all cases is IE 6 using https over a VPN
CPU utilization for java is usually < 2%
Here is the output of top when java is out of control:
load averages: 2.42, 2.43, 2.39    hou-ftp    09:45:50
88 processes: 85 sleeping, 1 stopped, 2 on cpu
CPU states: 51.9% idle, 31.8% user, 16.2% kernel, 0.0% iowait, 0.0% swap
Memory: 8.0G real, 5.8G free, 706M swap in use, 6.8G swap free
  PID USERNAME THR PR NCE  SIZE   RES STATE  TIME FLTS    CPU COMMAND
29028 root      32  0   0  365M  325M cpu02 17.1H    1 26.27% java
15313 root       1 20   0 1872K 1416K sleep  9:47    0  2.98% resend.sh
24938 root       1 59   0 1912K 1456K sleep  9:50    0  0.05% nocmon.20040406
 7123 mholly     1 29  10 2680K 1720K cpu06  1:09    0  0.04% top
 5339 root       1 59   0 1904K 1432K sleep 19:02    0  0.02% get_pushes.sh
    1 root       1 59   0 1304K  528K sleep 91:03    0  0.01% init
16772 root       1 59   0 1992K 1520K sleep 19:19    0  0.01% postmaster.sh
29503 root       1 59   0 4512K 2920K sleep  0:00    0  0.01% sshd
  776 root       1 59   0  952K  720K sleep  0:00    0  0.01% sleep
  702 root       1 59   0  952K  720K sleep  0:00    0  0.01% sleep
29429 root       1 59   0  952K  720K sleep  0:00    0  0.01% sleep
16785 root       1 59   0 1912K 1440K sleep 10:32    0  0.00% distribution.sh
11717 root       1 60   0 9488K 7848K sleep  0:23    0  0.00% arkvlib
  229 root       1 59   0 1872K 1416K sleep  0:00    0  0.00% ksh
 7753 root       1 59   0 4304K 3192K sleep 37.9H    0  0.00% burst_controlle
My SA ran truss while the box was in distress, but I do not understand
what I am looking at. He says it is polling something.
It appears this problem occurs after a user performs an upload to the
system. We have had problems with IE 6 and the poor way it manages
authentication; however, we seem to have a workaround for this.
Not all users induce this load when they upload. After the event in
question, CPU jumps from < 2% to ~25% (1 CPU taken up) and stays there.
If another event occurs, the CPU goes to ~50% (2 CPUs), and so on.
I think the event occurs when a user attempts an upload, it appears to
be hung, and they click the upload button again.
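If the double-click theory is correct, one possible mitigation would be
to refuse a second upload from the same user while the first is still in
progress. The following is only a rough sketch under that assumption:
UploadGuard and its method names are hypothetical, it is not tied to the
servlet API, and it uses raw collections so that it should still compile
on a 1.4 JVM.

```java
import java.util.HashSet;
import java.util.Set;

/**
 * Hypothetical guard that allows at most one in-flight upload per user.
 * Raw (non-generic) collections are used on purpose for 1.4 compatibility.
 */
class UploadGuard {
    // Keys of users that currently have an upload running.
    private final Set inFlight = new HashSet();

    /** Returns true if no upload is running for this user and marks one as started. */
    public synchronized boolean tryBegin(String user) {
        return inFlight.add(user);
    }

    /** Marks the user's upload as finished; intended to be called from a finally block. */
    public synchronized void end(String user) {
        inFlight.remove(user);
    }

    /** Number of uploads currently marked as running. */
    public synchronized int inFlightCount() {
        return inFlight.size();
    }
}
```

In an upload handler, the idea would be to call tryBegin(user) first,
reject the request if it returns false, and call end(user) in a finally
block so that a failed upload does not leave the user locked out.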
What options do I have for instrumenting this and understanding what is
going on?
My SA wants to know if there are any switches we can apply to help us
understand what is happening at a java level.
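To make concrete the kind of Java-level view I am after: on the current
1.4.1 JVM, sending SIGQUIT to the java process (kill -QUIT <pid>) should
print a thread dump to Tomcat's standard output. After a JVM upgrade,
something like the sketch below could list per-thread CPU time. It
assumes Java 5 or later, because java.lang.management does not exist in
1.4.1, and the class name ThreadCpuReport is hypothetical.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;

/** Hypothetical helper: lists live threads with CPU time and top stack frames. */
class ThreadCpuReport {

    /** Builds a text report; maxFrames limits the stack depth shown per thread. */
    static String report(int maxFrames) {
        ThreadMXBean mx = ManagementFactory.getThreadMXBean();
        // CPU timing is optional on some JVMs, so check before asking for it.
        boolean cpuOk = mx.isThreadCpuTimeSupported() && mx.isThreadCpuTimeEnabled();
        StringBuilder out = new StringBuilder();
        long[] ids = mx.getAllThreadIds();
        for (int i = 0; i < ids.length; i++) {
            ThreadInfo info = mx.getThreadInfo(ids[i], maxFrames);
            if (info == null) {
                continue; // thread exited between the two calls
            }
            long cpuNanos = cpuOk ? mx.getThreadCpuTime(ids[i]) : -1L;
            long cpuMillis = cpuNanos < 0 ? -1L : cpuNanos / 1000000L;
            out.append(ids[i]).append(' ').append(info.getThreadName())
               .append(" cpu_ms=").append(cpuMillis).append('\n');
            StackTraceElement[] stack = info.getStackTrace();
            for (int j = 0; j < stack.length; j++) {
                out.append("    at ").append(stack[j]).append('\n');
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(report(8));
    }
}
```

Calling report() twice a few seconds apart from inside the webapp and
comparing the cpu_ms values should show which thread is consuming a CPU
and where in the code it is running.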
Obviously, we are looking at upgrading the JVM. Any other suggestions?
Thanks
Michael Holly
Tailisen Technologies