Re: High Interrupt Load caused by pciide with sparc64 on SUN V210

2007-02-04 Thread Rolf Sommerhalder

The high interrupt load vanished after removing the CD-ROM drives from
both V210, as suggested by Mark Kettenis.

Now the CPU load is down to 0%, as one expects, and the systems are
much more performant and responsive than before :-)

# iostat -w 1
   tty         cd0             sd0            cpu
tin tout  KB/t t/s MB/s   KB/t t/s MB/s  us ni sy in id
  0   21  0.00   0 0.00   8.18   2 0.01   3  0  0 36 61
  0  172  0.00   0 0.00   0.00   0 0.00   0  0  0  0100
  0   57  0.00   0 0.00   0.00   0 0.00   0  0  0  0100
  0   57  0.00   0 0.00   0.00   0 0.00   0  0  0  0100
  0   57  0.00   0 0.00   0.00   0 0.00   0  0  0  0100
  0   57  0.00   0 0.00   0.00   0 0.00   0  0  0  0100
  0   57  0.00   0 0.00  16.00   2 0.03   0  0  0  0100
  0   57  0.00   0 0.00   0.00   0 0.00   0  0  0  0100
^C
#

Thanks to Mark for his suggestion,
Rolf



High Interrupt Load caused by pciide with sparc64 on SUN V210

2007-01-31 Thread Rolf Sommerhalder

Hello misc,

After having installed 4.0-current on two identically configured SUN
V210 (see dmesg below), I found that their performance was unusually
bad, notably with disk I/O.

top reveals a permanent interrupt load of between 30% and over 50%!?

# top
load averages:  0.09,  0.17,  0.08                                    19:52:16
13 processes:  12 idle, 1 on processor
CPU states:  1.9% user,  0.0% nice,  1.6% system, 39.0% interrupt, 57.5% idle
Memory: Real: 11M/122M act/tot  Free: 880M  Swap: 0K/487M used/tot

  PID USERNAME PRI NICE  SIZE   RES STATE  WAIT     TIME    CPU COMMAND
 1643 root       2    0  832K 2360K idle   select   0:02  0.00% sshd
30242 root       2    0 3544K 3504K sleep  select   0:01  0.00% sshd
24667 root       2    0  480K 1072K idle   select   0:00  0.00% inetd
 2435 root       2    0 1592K 2136K sleep  select   0:00  0.00% sendmail
20073 root      18    0  824K  616K sleep  pause    0:00  0.00% ksh
24745 root       2    0  536K 1056K idle   poll     0:00  0.00% ntpd
29783 root       2    0  664K 1184K sleep  select   0:00  0.00% cron
25532 root       3    0  352K 1104K idle   ttyin    0:00  0.00% getty
21605 _syslogd   2    0  544K 1000K idle   poll     0:00  0.00% syslogd
    1 root      10    0  536K  424K idle   wait     0:00  0.00% init
24060 root       2    0  520K  976K idle   netio    0:00  0.00% syslogd
 3809 _ntp       2    0  408K 1056K idle   poll     0:00  0.00% ntpd
24849 root      31    0  504K 1648K onproc -        0:00  0.00% top


vmstat hints that pciide0 generates interrupts at a very high rate,
although there is hardly anything running on the boxes:

# vmstat -i
interrupt       total     rate
bge0              514        3
com0               82        0
pciide0     134564203   879504
siop0            2158       14
siop1               1        0
clock           15327      100
Total       134582285   879622
# uptime
7:50PM  up 3 mins, 1 user, load averages: 0.27, 0.21, 0.09
#
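
To make the comparison between interrupt sources explicit, here is a
minimal Python sketch that ranks the entries of a `vmstat -i` listing.
The figures in SAMPLE are copied from the output above with the columns
separated; the function names are hypothetical, and the parser simply
assumes three whitespace-separated columns (name, total, rate).

```python
# Sketch: rank interrupt sources from `vmstat -i`-style output.
# Assumption: each data line is "name total rate", whitespace-separated.

SAMPLE = """\
interrupt       total     rate
bge0              514        3
com0               82        0
pciide0     134564203   879504
siop0            2158       14
siop1               1        0
clock           15327      100
Total       134582285   879622
"""


def parse_vmstat_i(text):
    """Return (name, total, rate) tuples, skipping the header and Total."""
    rows = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 3:
            continue
        name, total, rate = parts
        # The header line has non-numeric columns and is skipped here.
        if not (total.isdigit() and rate.isdigit()):
            continue
        if name == "Total":
            continue
        rows.append((name, int(total), int(rate)))
    return rows


def busiest(rows):
    """Return the row with the highest interrupt rate."""
    return max(rows, key=lambda row: row[2])


rows = parse_vmstat_i(SAMPLE)
name, total, rate = busiest(rows)
grand_total = sum(row[1] for row in rows)
share = 100.0 * total / grand_total
print(f"{name}: {rate} interrupts/s, {share:.2f}% of all interrupts")
```

With the numbers above, pciide0 accounts for nearly all interrupts,
while the clock runs at its expected 100 per second.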

# iostat -w 1
   tty         cd0             sd0            cpu
tin tout  KB/t t/s MB/s   KB/t t/s MB/s  us ni sy in id
  0   29  0.00   0 0.00  15.09   1 0.01   0  0  0 36 64
  0  171  0.00   0 0.00   0.00   0 0.00   0  0  0 38 62
  0   57  0.00   0 0.00   0.00   0 0.00   0  0  0 33 67
  0   57  0.00   0 0.00   0.00   0 0.00   0  0  0 38 62
  0   57  0.00   0 0.00   0.00   0 0.00   0  0  0 28 72
  0   57  0.00   0 0.00   0.00   0 0.00   0  0  0 34 66
  0   57  0.00   0 0.00   0.00   0 0.00   0  0  0 27 73
  0   57  0.00   0 0.00   0.00   0 0.00   0  0  0 40 60
  0   57  0.00   0 0.00   0.00   0 0.00   0  0  0 31 69
  0   57  0.00   0 0.00   0.00   0 0.00   0  0  0 36 64
  0   57  0.00   0 0.00   0.00   0 0.00   0  0  0 29 71
  0   57  0.00   0 0.00   0.00   0 0.00   0  0  0 31 69
  0   57  0.00   0 0.00   0.00   0 0.00   0  0  0 34 66
  0   57  0.00   0 0.00   0.00   0 0.00   0  0  0 37 63
  0   57  0.00   0 0.00   0.00   0 0.00   0  0  0 32 68
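
For anyone who wants to summarise such a run instead of eyeballing it,
the following Python sketch averages the interrupt ("in") column of the
samples above. It is only an illustration: the function name is
hypothetical, and it assumes the 13-column layout shown here (tty, cd0,
sd0, cpu). Lines where two columns run together, as happens when idle
reaches 100, do not have 13 fields and are simply skipped.

```python
# Sketch: summarise the "in" (interrupt %) column of `iostat -w 1` samples.
# Assumption: 13 whitespace-separated fields per line, "in" being the 12th.

SAMPLES = """\
  0   29  0.00   0 0.00  15.09   1 0.01   0  0  0 36 64
  0  171  0.00   0 0.00   0.00   0 0.00   0  0  0 38 62
  0   57  0.00   0 0.00   0.00   0 0.00   0  0  0 33 67
  0   57  0.00   0 0.00   0.00   0 0.00   0  0  0 38 62
  0   57  0.00   0 0.00   0.00   0 0.00   0  0  0 28 72
  0   57  0.00   0 0.00   0.00   0 0.00   0  0  0 34 66
  0   57  0.00   0 0.00   0.00   0 0.00   0  0  0 27 73
  0   57  0.00   0 0.00   0.00   0 0.00   0  0  0 40 60
  0   57  0.00   0 0.00   0.00   0 0.00   0  0  0 31 69
  0   57  0.00   0 0.00   0.00   0 0.00   0  0  0 36 64
  0   57  0.00   0 0.00   0.00   0 0.00   0  0  0 29 71
  0   57  0.00   0 0.00   0.00   0 0.00   0  0  0 31 69
  0   57  0.00   0 0.00   0.00   0 0.00   0  0  0 34 66
  0   57  0.00   0 0.00   0.00   0 0.00   0  0  0 37 63
  0   57  0.00   0 0.00   0.00   0 0.00   0  0  0 32 68
"""


def interrupt_percentages(text):
    """Collect the interrupt percentage from every well-formed sample line."""
    values = []
    for line in text.splitlines():
        parts = line.split()
        if len(parts) != 13:
            continue  # header, blank, or fused-column line
        values.append(int(parts[11]))
    return values


values = interrupt_percentages(SAMPLES)
mean = sum(values) / len(values)
print(f"samples={len(values)} min={min(values)} max={max(values)} mean={mean:.1f}")
```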

Maybe I should mention that I was lazy and installed everything under /
into a single partition because it's only a lab setup.
Also, in both boxes, I installed a SK-9S91 PCI 1 Gbit/s fiber NIC, in
addition to the four on-board bge NICs.

Rebooting both boxes multiple times did not cure the high interrupt
rate, and enabling softupdates on /dev/sd0a did not help either.

The third-to-last line in the output of dmesg below is somewhat intriguing:
No counter-timer -- using %tick at 1336MHz as system clock. root on sd0a
But according to other dmesg from the archives, this seems to be
common among sparc64 installs.

In the archives, I found postings regarding a similar problem on an
Ultra 5. Apparently, the only recommendation there was to enable
softupdates. I remember having observed similar performance problems
on my own Ultra 5 some months ago under OpenBSD 3.9, which I was
unable to resolve.

Should I try a re-install with more partitions, or add more partitions
for the usual mount points?
Is there anything I can do from within OpenBoot to avoid interrupt
conflicts between pciide0 and the SK-9S91 PCI NIC, for example?
Is there anything else that I should try in order to silence that
interrupt source?
I am happy to rebuild the kernel after patching and to re-test.

Thanks for your attention and any suggestions,
Rolf


# disklabel sd0
# /dev/rsd0c:
type: SCSI
disk: SCSI disk
label: SUN72G cyl 14087
flags:
bytes/sector: 512
sectors/track: 424
tracks/cylinder: 24
sectors/cylinder: 10176
cylinders: 14087
total sectors: 143349312
rpm: 10025
interleave: 1
trackskew: 0
cylinderskew: 0
headswitch: 0   # microseconds
track-to-track seek: 0  # microseconds
drivedata: 0

16 partitions:
#        size   offset    fstype   [fsize bsize   cpg]
  a: 70000704        0    4.2BSD     2048 16384    16 # Cyl     0 -  6878
  b:   997248 70000704      swap                      # Cyl  6879 -  6976
 c: