Hello misc,
After installing 4.0-current on two identically configured Sun
V210s (see dmesg below), I found that their performance was unusually
bad, notably for disk I/O.
top reveals a permanent interrupt load of between 30% and over 50% !?
# top
load averages: 0.09, 0.17, 0.08                               19:52:16
13 processes: 12 idle, 1 on processor
CPU states: 1.9% user, 0.0% nice, 1.6% system, 39.0% interrupt, 57.5% idle
Memory: Real: 11M/122M act/tot Free: 880M Swap: 0K/487M used/tot
  PID USERNAME PRI NICE  SIZE   RES STATE  WAIT     TIME    CPU COMMAND
 1643 root       2    0  832K 2360K idle   select   0:02  0.00% sshd
30242 root       2    0 3544K 3504K sleep  select   0:01  0.00% sshd
24667 root       2    0  480K 1072K idle   select   0:00  0.00% inetd
 2435 root       2    0 1592K 2136K sleep  select   0:00  0.00% sendmail
20073 root      18    0  824K  616K sleep  pause    0:00  0.00% ksh
24745 root       2    0  536K 1056K idle   poll     0:00  0.00% ntpd
29783 root       2    0  664K 1184K sleep  select   0:00  0.00% cron
25532 root       3    0  352K 1104K idle   ttyin    0:00  0.00% getty
21605 _syslogd   2    0  544K 1000K idle   poll     0:00  0.00% syslogd
    1 root      10    0  536K  424K idle   wait     0:00  0.00% init
24060 root       2    0  520K  976K idle   netio    0:00  0.00% syslogd
 3809 _ntp       2    0  408K 1056K idle   poll     0:00  0.00% ntpd
24849 root      31    0  504K 1648K onproc -        0:00  0.00% top
vmstat hints that pciide0 generates interrupts at a very high rate,
although there is hardly anything running on the boxes:
# vmstat -i
interrupt           total     rate
bge0                 5143
com0                  820
pciide0         134564203   879504
siop0                2158       14
siop1                  10
clock               15327      100
Total           134582285   879622
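As a sanity check, the rate column can be reproduced by diffing the raw
counters across two snapshots taken a known interval apart. A rough
sketch (the two here-documents stand in for real vmstat -i output; the
counter values are made up):

```shell
#!/bin/sh
# Cross-check the vmstat -i "rate" column by diffing two snapshots
# taken N seconds apart.  On a real box you would capture
# "vmstat -i" twice; the numbers below are fabricated examples.
N=10
cat > /tmp/vi.1 <<'EOF'
pciide0 134564203 879504
clock 15327 100
EOF
cat > /tmp/vi.2 <<'EOF'
pciide0 143359243 879504
clock 16327 100
EOF
# First pass stores the old totals, second pass prints (new-old)/N.
awk -v n=$N 'NR==FNR { t[$1]=$2; next }
             $1 in t { printf "%s %d/s\n", $1, ($2-t[$1])/n }' \
    /tmp/vi.1 /tmp/vi.2
```

With the sample numbers this prints pciide0 879504/s and clock 100/s,
i.e. the same rates vmstat reports.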
# uptime
7:50PM up 3 mins, 1 user, load averages: 0.27, 0.21, 0.09
#
# iostat -w 1
      tty            cd0              sd0             cpu
 tin tout  KB/t t/s MB/s   KB/t t/s MB/s  us ni sy in id
   0   29  0.00   0 0.00  15.09   1 0.01   0  0  0 36 64
   0  171  0.00   0 0.00   0.00   0 0.00   0  0  0 38 62
   0   57  0.00   0 0.00   0.00   0 0.00   0  0  0 33 67
   0   57  0.00   0 0.00   0.00   0 0.00   0  0  0 38 62
   0   57  0.00   0 0.00   0.00   0 0.00   0  0  0 28 72
   0   57  0.00   0 0.00   0.00   0 0.00   0  0  0 34 66
   0   57  0.00   0 0.00   0.00   0 0.00   0  0  0 27 73
   0   57  0.00   0 0.00   0.00   0 0.00   0  0  0 40 60
   0   57  0.00   0 0.00   0.00   0 0.00   0  0  0 31 69
   0   57  0.00   0 0.00   0.00   0 0.00   0  0  0 36 64
   0   57  0.00   0 0.00   0.00   0 0.00   0  0  0 29 71
   0   57  0.00   0 0.00   0.00   0 0.00   0  0  0 31 69
   0   57  0.00   0 0.00   0.00   0 0.00   0  0  0 34 66
   0   57  0.00   0 0.00   0.00   0 0.00   0  0  0 37 63
   0   57  0.00   0 0.00   0.00   0 0.00   0  0  0 32 68
Maybe I should mention that I was lazy and installed everything under /
into a single partition, because it's only a lab setup.
Also, in both boxes, I installed a SK-9S91 PCI 1 Gbit/s fiber NIC, in
addition to the four on-board bge NICs.
Rebooting both boxes multiple times did not cure the high interrupt
rate, and enabling softupdates on /dev/sd0a did not help either.
The third-to-last line in the output of dmesg below is somewhat intriguing:
No counter-timer -- using %tick at 1336MHz as system clock. root on sd0a
But judging from other dmesg output in the archives, this seems to be
common among sparc64 installs.
In the archives, I found postings regarding a similar problem on an
Ultra 5. Apparently, the only recommendation there was to enable
softupdates. I remember having observed similar performance problems
on my own Ultra 5 some months ago under OpenBSD 3.9, which I was
unable to resolve.
Shall I try a re-install with more partitions, i.e. separate
partitions for the usual mount points?
Is there anything to do in OpenBoot in order to avoid interrupt
conflicts between pciide0 and the SK-9S91 PCI NIC, for example?
Is there anything else that I should try in order to silence that
interrupt source?
I am happy to rebuild the kernel after patching and to re-test.
Thanks for your attention and any suggestions,
Rolf
# disklabel sd0
# /dev/rsd0c:
type: SCSI
disk: SCSI disk
label: SUN72G cyl 14087
flags:
bytes/sector: 512
sectors/track: 424
tracks/cylinder: 24
sectors/cylinder: 10176
cylinders: 14087
total sectors: 143349312
rpm: 10025
interleave: 1
trackskew: 0
cylinderskew: 0
headswitch: 0 # microseconds
track-to-track seek: 0 # microseconds
drivedata: 0
16 partitions:
#             size   offset  fstype  [fsize bsize  cpg]
  a:      70000704        0  4.2BSD    2048 16384   16  # Cyl    0 - 6878
  b:        997248 70000704    swap                     # Cyl 6879 - 6976
c: