Hi list.
I have a problem with a P4 (hyper-threaded) powered server. It
constantly shows a load average of about 2, yet when I look with top I
don't see any process actually using that much CPU.
Here's a snippet of /proc/cpuinfo:
vendor_id : GenuineIntel
cpu family : 15
model : 3
model name : Intel(R) Pentium(R) 4 CPU 3.00GHz
stepping : 4
cpu MHz : 2993.807
cache size : 1024 KB
(I have two of those listed, of course)
And here's a sample of top's output (sorted by CPU usage):
top - 10:36:47 up 84 days, 1:27, 3 users, load average: 2.01, 2.02, 1.97
Tasks: 73 total, 1 running, 65 sleeping, 0 stopped, 7 zombie
Cpu(s): 100.0% us, 0.0% sy, 0.0% ni, 0.0% id, 0.0% wa, 0.0% hi, 0.0% si
Mem: 1032328k total, 1010516k used, 21812k free, 146464k buffers
Swap: 1036152k total, 14944k used, 1021208k free, 673464k cached
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
445 root 16 0 0 0 0 S 0.3 0.0 1:17.11 kjournald
964 root 16 0 2172 1044 1964 R 0.3 0.1 0:00.19 top
1 root 16 0 1580 508 1424 S 0.0 0.0 0:02.47 init
2 root RT 0 0 0 0 S 0.0 0.0 0:00.00 migration/0
3 root 34 19 0 0 0 S 0.0 0.0 0:00.10 ksoftirqd/0
4 root RT 0 0 0 0 S 0.0 0.0 0:00.00 migration/1
5 root 34 19 0 0 0 S 0.0 0.0 0:00.05 ksoftirqd/1
6 root 5 -10 0 0 0 S 0.0 0.0 0:00.00 events/0
7 root 5 -10 0 0 0 S 0.0 0.0 0:00.00 events/1
[lots of other kernel processes and then some user processes]
As you can see, userspace CPU usage is at 100%, which I assume means 100%
of all processor resources in the system (that would account for the 2.0
load average, as we effectively have two CPUs), but no process listed by
top actually takes any significant amount of CPU time. Sorting by CPU time,
I get kjournald with almost 6 hours (over 84 days - had a power failure
about 3 months back), then sshd (the server is headless), and everything
else has less than 10 minutes.
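Since top shows nothing obvious, one thing that might be worth checking is
which tasks are counted in the load average at all. On Linux the load
average includes tasks that are runnable (state R) and tasks in
uninterruptible sleep (state D), threads included. Below is a minimal
sketch, assuming the kernel exposes /proc/<pid>/task/<tid>/stat; the
function names parse_stat_line and load_contributors are made up for
illustration.

```python
import os


def parse_stat_line(line):
    """Return (id, command, state) from one /proc/.../stat line."""
    # The command name sits in parentheses and may itself contain spaces
    # or parentheses, so split on the last ')' rather than on whitespace.
    left, right = line.rsplit(")", 1)
    id_text, comm = left.split("(", 1)
    state = right.split()[0]
    return int(id_text), comm, state


def load_contributors(proc_root="/proc"):
    """List (tid, command, state) for every task in state R or D."""
    found = []
    try:
        pids = [p for p in os.listdir(proc_root) if p.isdigit()]
    except OSError:
        return found  # no procfs at this path
    for pid in pids:
        task_dir = os.path.join(proc_root, pid, "task")
        try:
            tids = os.listdir(task_dir)
        except OSError:
            continue  # process exited while we were scanning
        for tid in tids:
            try:
                with open(os.path.join(task_dir, tid, "stat")) as f:
                    entry = parse_stat_line(f.read())
            except (OSError, ValueError):
                continue
            if entry[2] in ("R", "D"):
                found.append(entry)
    return found


if __name__ == "__main__":
    for tid, comm, state in load_contributors():
        print(tid, state, comm)
```

Run a few times in a row, this should show whether the same two tasks keep
turning up in R or D state; the script itself will normally appear as one
R entry.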
The server is mostly used to run the Nagios monitor and some Java daemons.
Tomcat is running and takes about 1.2GB of virtual memory, which is about
60% of all memory, but it sees absolutely no usage and uses less than 5% of
real memory. Two other Java services and MySQL together grab another 600MB
of virtual memory, and everything else is mostly scripts that use
negligible amounts of VIRT, RES and CPU.
Another weird thing is that a quick calculation would put the VIRT
usage of the system very close to the total memory available (1GB
physical + 1GB swap), yet the top output above shows more than half of
memory to be available(!).
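For anyone who wants to repeat that quick calculation, here is a rough
sketch that adds up VmSize (what top reports as VIRT) and VmRSS (what top
reports as RES) over all processes, so the two totals can be compared with
the Mem and Swap lines above. It assumes /proc/<pid>/status carries those
two fields in kB; parse_status_kb and total_memory_kb are names invented
for this example. Note that the RSS total counts pages shared between
processes more than once, so it is only an upper bound.

```python
import os


def parse_status_kb(text, fields=("VmSize", "VmRSS")):
    """Extract the requested fields (in kB) from /proc/<pid>/status text."""
    sizes = {}
    for line in text.splitlines():
        name, _, rest = line.partition(":")
        if name in fields:
            parts = rest.split()
            if parts:
                sizes[name] = int(parts[0])
    return sizes


def total_memory_kb(proc_root="/proc"):
    """Sum VmSize and VmRSS over every process found under proc_root."""
    totals = {"VmSize": 0, "VmRSS": 0}
    try:
        pids = [p for p in os.listdir(proc_root) if p.isdigit()]
    except OSError:
        return totals  # no procfs at this path
    for pid in pids:
        try:
            with open(os.path.join(proc_root, pid, "status")) as f:
                sizes = parse_status_kb(f.read())
        except OSError:
            continue  # process exited while we were scanning
        # Kernel threads have no Vm* lines, so they simply add nothing.
        for name, kb in sizes.items():
            totals[name] += kb
    return totals


if __name__ == "__main__":
    totals = total_memory_kb()
    print("total VIRT: %d kB" % totals["VmSize"])
    print("total RES:  %d kB" % totals["VmRSS"])
```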
At this point I'm clueless and would appreciate any idea that might
explain these figures, or anything that I can try on the server
(it's a production server, so don't try to be funny :-).
My only other option currently is to run init 1; init 3, which, all things
considered, I'm loath to do, although the Nagios warnings are getting
quite annoying.
--
Oded
::..
The most wasted of all days is one without laughter.
-- e.e. cummings