Hello,
I am having a problem which I am fairly sure I have traced to
running 6.X-based code on 7.2 with Apache, such as Wusage and Scomtherm.
Everything runs fine on the server without these running, but when I start the
program the system spirals downwards, only on http from what I can tell.
I actually find that running Wusage 8.0 a few times, even with nice -19,
may be implicated in getting the system to spiral downwards. I hesitate
to mention this as it seems to be working fine on another 7.X server. I
believe that Wusage is tied to 6.X libraries, and I wonder if somehow
this may be a factor.
Is the high load average simply a function of processes blocking on
network I/O? Our av/spam scanners, for example, show a high load average
because there are many processes waiting on network I/O to complete
(e.g. talking to RBL lists, waiting for DCC servers to complete, etc.).
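A quick tally of process states shows that sort of blocking; a sketch (the exact state letters vary a bit between platforms, but "D"-style uninterruptible waits count toward the load average without using any CPU):

```shell
# Tally processes by the first letter of their state; sleep/wait
# states contribute to the load average even with zero CPU use.
ps ax -o state= | cut -c1 | sort | uniq -c | sort -rn
```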
last pid: 46013;  load averages: 105.30, 67.67, 34.45   up 4+23:59:42  19:08:40
629 processes: 89 running, 540 sleeping
CPU: 21.9% user, 0.0% nice, 74.5% system, 3.1% interrupt, 0.4% idle
Mem: 1538M Active, 11G Inact, 898M Wired, 303M Cache,
Just to confirm, we see something similar on the box which runs our stats.
We have updated from 5.4 -> 6.0 -> 6.2 -> 7.0, and none of it has had any
effect on the lockups, which happen when the stats run.
This box is also on an Areca controller, but it was on an Adaptec and we
saw pretty much the same thing.
[ns8]# vmstat -i
interrupt                 total          rate
irq4: sio0                57065             0
irq17: em1                3989494045554
irq18: arcmsr0            558098657        77
cpu0: timer               14381393929    2000
The next thing I am going to do is remove the QUOTA feature to see if
it has any bearing on this problem. It does not appear to even be
writing at a heavy load, as you can see (almost nothing), but the
processes are mostly in the UFS state when it spirals out of control.
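One way to see where those UFS-state processes are actually stuck is to tally kernel wait channels; just a sketch (the wchan keyword is supported by both FreeBSD and Linux ps, though the channel names differ):

```shell
# Tally processes by kernel wait channel; a pile-up on one channel
# (e.g. "ufs") shows where most processes are blocked.
ps ax -o wchan= | sort | uniq -c | sort -rn | head
```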
What does top -S show ? Most of the load is in system. Does the
machine in question have a rather large master.passwd file by chance ?
(http://www.freebsd.org/cgi/query-pr.cgi?pr=75855)
---Mike
Thanks for your quick reply:
master.passwd is only 9467 bytes (per ls -l)
TOP -ISM at times
Replying to my own post ...
I have done a test on the same machine comparing 6.3-p1 to 7.1-PRE.
The performance is the expected ~6MB/s (because of the lack of cache)
on 6.3-p1, so the BIOS change doesn't seem to be at fault.
This seems to be a regression somewhere between 6.3 and 7.1.
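For reference, a generic sequential-write check of the kind such comparisons usually rest on (just a sketch; the scratch path and size here are arbitrary, not the actual test that was run):

```shell
# Write 64 MB of zeros to a scratch file; dd's summary line on stderr
# reports elapsed time and throughput for a rough release-to-release
# comparison.
dd if=/dev/zero of=/tmp/ddtest bs=1048576 count=64 2>&1 | tail -1
rm -f /tmp/ddtest
```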
I would try the change to /etc/nsswitch.conf so that group and passwd
read
group: files
passwd: files
At that file size, it sounds like you only have about 200 entries? I
doubt it's the issue, but it's worth a try. I know at around 9,000
entries anything to do with UID lookups (e.g. ls -l) gets very slow.
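A rough way to confirm the entry count is to count non-comment lines; sketched here against a throwaway sample file, since on the real box you would point it at /etc/master.passwd:

```shell
# Count non-comment entries in a passwd-format file. Demonstrated on a
# small temp-file sample; substitute /etc/master.passwd on the server.
f=$(mktemp)
printf '# comment line\nroot:*:0:0::0:0:root:/root:/bin/sh\nwww:*:80:80::0:0:WWW:/nonexistent:/usr/sbin/nologin\n' > "$f"
grep -vc '^#' "$f"    # prints 2
rm -f "$f"
```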
I would also try disabling polling. Is your scheduler ULE or 4BSD? For
an 8-core box, it should be ULE.
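On FreeBSD the active scheduler can be confirmed at runtime via sysctl; a sketch, with a fallback so it degrades gracefully on systems without that OID:

```shell
# kern.sched.name reports "ULE" or "4BSD" on FreeBSD; print a note
# instead of failing where the sysctl does not exist.
sysctl -n kern.sched.name 2>/dev/null || echo "kern.sched.name not available here"
```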
---Mike
Hi Mike,
Thanks, I will try this now, as I have not tried it yet.
Here is the current custom kernel and it is using ULE:
cpu HAMMER
ident MYCOMPUTER