Steve Shockley wrote:
> A few days ago, I had an old Windows box that worked as an inbound mail
> relay start to fail, so I figured I'd replace it with two OpenBSD boxes
> in a CARP pool.
>
> It's a big VMware shop, and I've mostly had good luck running OpenBSD
> under ESX, so I set up two 4.6 amd64 VMs and put them into production.
>
> The site gets about 30-40k messages per day. During periods of heavy
> load, the load average would occasionally spike over 12, and Sendmail
> would dutifully stop accepting new mail until it slowed down.
> Unfortunately, since this box is just a relay (second hop inbound), that
> meant the first hop inbound would start to just queue messages since the
> second hop wasn't responding.

Oh, I'm painfully familiar with that little issue. :-/
However, your load seems very modest.

> I figured maybe it was a VMware problem, so I cranked the load average
> threshold to 50 to work around the problem, and built a physical box (an
> HP DL360 G4, 2gb RAM). This was also the day I got my 4.7 CDs, so I
> installed 4.7.
>
> The physical box is doing much better, but the load averages are still
> much higher than I'd expect, generally never going below 1. I realize
> load averages are usually lies, but the box seems to be working a lot
> harder than I'd expect. For reference, the Windows box I replaced was a
> DL380 G2, with a single P3-1.4 and 256mb RAM, and it was running a
> commercial antivirus product based on Sendmail.
>
> What can I do to diagnose the performance bottleneck? The CPU is mostly
> idle.

Look at the blinky lights on the hard disks?

I know, macho admins love to look at magical system parameters, but I
usually solve such problems by looking at the disk activity lights
(which is also why I dislike Sun and Macintosh systems).

I suspect you are I/O bound. (OK, that's not my most clever diagnosis
of the day...)

* softdeps? (I know, a few people hyperventilate over softdeps on mail
  servers, but really... if your mail server crashes enough to worry
  about uncommitted-to-disk messages, you probably have issues much
  bigger than softdeps. This is mostly an academic issue; in real life,
  you will lose more mail through your mail filters than you will
  through crashes and softdeps if your servers are at all reliable.)
* Cache active on the RAID card? No cache = sucky performance!
* Good RAID config? (RAID 1+0 rocks. RAID 5 just pretends to, until you
  lose a disk. RAID 1 trades some write performance for redundancy.)
* Could you have a degraded RAID set? (I.e., you have RAID 5, but blew
  out a disk and didn't notice.)
* Some Compaq/HP RAID systems don't perform overly well on OpenBSD (not
  sure about other OSs).

Nick.
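
For what it's worth, the reasoning above (load average up, CPU mostly idle, so suspect the disks) can be written down as a tiny triage helper. This is only a rough sketch in Python, not anything from the thread: the function name `diagnose`, the 70% idle cutoff, and the "load per CPU below 1 is fine" rule are all assumptions picked for illustration, and load averages remain the rough indicator the original poster says they are.

```python
def diagnose(load_avg: float, cpu_idle_pct: float, ncpu: int = 1,
             idle_threshold: float = 70.0) -> str:
    """Rough triage from two numbers you can read off top(1).

    load_avg       -- 1-minute load average
    cpu_idle_pct   -- percentage of CPU time reported as idle
    ncpu           -- number of CPUs (assumed 1 if unknown)
    idle_threshold -- hypothetical cutoff for "mostly idle"
    """
    # Normalise the load by CPU count; guard against ncpu <= 0.
    per_cpu = load_avg / max(ncpu, 1)

    # Less than one waiting process per CPU: nothing to chase.
    if per_cpu < 1.0:
        return "ok"

    # Work is piling up but the CPU is idle: processes are most
    # likely blocked on something else, typically disk I/O.
    if cpu_idle_pct >= idle_threshold:
        return "probably waiting on I/O"

    # Work is piling up and the CPU is busy.
    return "probably CPU bound"


# The situation described in the thread: load spiking over 12 with a
# mostly idle CPU (90% idle is an assumed figure).
print(diagnose(12.0, 90.0))  # probably waiting on I/O
```

If it comes back pointing at I/O, that is the cue to go look at the disk lights, the softdep mount option, and the state of the RAID set as listed above.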

