Re: New FreeBSD 5.3 e-mail server extremely slow - traced to getpwnam maybe ?
Quoting Kris Kennaway [EMAIL PROTECTED]:

> On Tue, Jan 04, 2005 at 09:27:27PM -0500, Bruce Campbell wrote:
>
> > I wrote a small program:
> >
> >   #include <sys/types.h>
> >   #include <pwd.h>
> >   main( int argc, char *argv[] ) { getpwuid( 13076 ); }
> >
> > and ran it under truss on 5.x and it generated 178,711 lines of
> > output. (the bulk of which is those lseek/read calls as above) ...
>
> Try tuning the pwd_mkdb parameters (see hash(3)) in
> /usr/src/usr.sbin/pwd_mkdb/pwd_mkdb.c and recompile:
>
>   HASHINFO openinfo = {
>           4096,           /* bsize */
>           32,             /* ffactor */
>           256,            /* nelem */
>           2048 * 1024,    /* cachesize */
>           NULL,           /* hash() */
>           0               /* lorder */
>   };
>
> e.g. adjust nelem to 12000 to accommodate your
> significantly-larger-than-average password database. If this helps,
> please submit a PR requesting that someone make an option to pwd_mkdb
> to tune this at runtime (or better yet, submit the patch to do this
> yourself - it's straightforward to modify the source to do this).

Thanks. That had no effect on the large number of seeks/reads needed to do a getpwuid of a specific uid. I tried boosting that number further, still no change.

I suspect the problem is related to some change to the hash functions between 4.7 and 5.2.1, and I hope to get to the bottom of it today.

I tried two getpwnam (as opposed to getpwuid) calls on 2 different userids: one took 1000 seek/reads, the other 16,000, so it's all pretty random, no doubt related to how stuff gets hashed. On 4.7 it takes just one or two reads/seeks.

As each login via ipop and imap, each sendmail, and just about everything else will be doing getpwnam calls, I think this is our problem.

--
Bruce Campbell
Engineering Computing CPH-2374B
University of Waterloo
(519)888-4567 ext 5889

This mail sent through www.mywaterloo.ca
___
freebsd-questions@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-questions
To unsubscribe, send any mail to [EMAIL PROTECTED]
Re: New FreeBSD 5.3 e-mail server extremely slow - traced to getpwnam maybe ?
Quoting Bruce Campbell [EMAIL PROTECTED]:

> On Tue, Jan 04, 2005 at 09:27:27PM -0500, Bruce Campbell wrote:
>
> I wrote a small program:
>
>   #include <sys/types.h>
>   #include <pwd.h>
>   main( int argc, char *argv[] ) { getpwuid( 13076 ); }
>
> and ran it under truss on 5.x and it generated 178,711 lines of
> output. (the bulk of which is those lseek/read calls as above)

It looks like the overhaul of getpwent in Apr/2003 to make it thread safe:

http://www.freebsd.org/cgi/cvsweb.cgi/src/lib/libc/gen/getpwent.c

may be the problem. I've tested the dbm_fetch function independently on a large file, and it is fine.

I've opened a bug report, and plan to build a replacement 4.x mail server, as the most deterministic path to restoring adequate e-mail service to our users.

Can anyone suggest a workaround ?

--
Bruce Campbell
Engineering Computing CPH-2374B
University of Waterloo
(519)888-4567 ext 5889
Re: New FreeBSD 5.3 e-mail server extremely slow - traced to getpwnam maybe ?
Quoting Bruce Campbell [EMAIL PROTECTED]:

> Quoting Bruce Campbell [EMAIL PROTECTED]:
>
> > On Tue, Jan 04, 2005 at 09:27:27PM -0500, Bruce Campbell wrote:
> >
> > I wrote a small program:
> >
> >   #include <sys/types.h>
> >   #include <pwd.h>
> >   main( int argc, char *argv[] ) { getpwuid( 13076 ); }
> >
> > and ran it under truss on 5.x and it generated 178,711 lines of
> > output. (the bulk of which is those lseek/read calls as above)
>
> It looks like the overhaul of getpwent in Apr/2003 to make it thread
> safe:
>
> http://www.freebsd.org/cgi/cvsweb.cgi/src/lib/libc/gen/getpwent.c
>
> may be the problem. I've tested the dbm_fetch function independently
> on a large file, and it is fine.
>
> I've opened a bug report, and plan to build a replacement 4.x mail
> server, as the most deterministic path to restoring adequate e-mail
> service to our users.
>
> Can anyone suggest a workaround ?

Well, somewhat unbelievably, copying a getpwent.c from 4.7 and remaking libc on 5.3 with it worked. Load average has gone from 70 to 2.

And, so that this qualifies as a question...

Am I crazy to pull an old getpwnam from 4.7 and blindly build it on 5.3 ?

--
Bruce Campbell
Engineering Computing CPH-2374B
University of Waterloo
(519)888-4567 ext 5889
Re: New FreeBSD 5.3 e-mail server extremely slow - traced to getpwnam maybe ?
Quoting Bruce Campbell [EMAIL PROTECTED]:

> ...
> Well, somewhat unbelievably, copying a getpwent.c from 4.7 and
> remaking libc on 5.3 with it worked. Load average has gone from
> 70 to 2.

One of my co-workers has found a less kludgey workaround for the high load problem we were seeing on 5.3 with a large /etc/master.passwd, as follows:

--- /etc/nsswitch.conf.old      Wed Jan  5 19:23:24 2005
+++ /etc/nsswitch.conf  Wed Jan  5 19:23:43 2005
@@ -1,7 +1,7 @@
-group: compat
+group: files
 group_compat: nis
 hosts: files dns
 networks: files
-passwd: compat
+passwd: files
 passwd_compat: nis
 shells: files

System is purring with load average under 1 now, 200,000 pop/imap sessions per day and 200,000 e-mails per day, all spamassassinated.

For more details and ongoing followup, see:

http://www.freebsd.org/cgi/query-pr.cgi?pr=75855

--
Bruce Campbell
Engineering Computing CPH-2374B
University of Waterloo
(519)888-4567 ext 5889
New FreeBSD 5.3 e-mail server extremely slow...
We upgraded from a dual 1.66GHz AMD running FreeBSD 4.7 to a dual 3GHz Xeon running FreeBSD 5.3 and the new server is painfully slow, even after turning spamassassin and yavr (yet another virus recipe) off. Load appears to be imapd/ipop3d (uw-imapd) related.

New server is Adaptec SCSI RAID, old one was 3ware ATA RAID, but disk load is relatively low anyway.

It is a fairly high volume server, maybe 150,000 messages per day and 150,000 pop/imap sessions per day. But the old box was doing relatively fine.

Turning off hyperthreading helped a lot, but not enough. Load average is around 48 now; I've set the 2 sendmail conf load av settings to 48 so at least e-mail gets in.

A quick truss of an ipop3d process shows piles of this streaming by...

setitimer(0,{0 0, 0 0},{0 0, 599 92})            = 0 (0x0)
write(1,0x805a000,21)                            = 21 (0x15)
gettimeofday({1104857422 906783},0x0)            = 0 (0x0)
setitimer(0,{0 0, 600 0},{0 0, 0 0})             = 0 (0x0)
read(0x0,0x8063000,0x832c)                       = 10 (0xa)
setitimer(0,{0 0, 0 0},{0 0, 600 0})             = 0 (0x0)
write(1,0x805a000,14)                            = 14 (0xe)
gettimeofday({1104857422 908916},0x0)            = 0 (0x0)
setitimer(0,{0 0, 600 0},{0 0, 0 0})             = 0 (0x0)

top shows 80-90% system activity.

About to revert to our old box and maybe nfs mount /var/mail to make it less painful.

Any suggestions ?

--
Bruce Campbell
Engineering Computing CPH-2374B
University of Waterloo
(519)888-4567 ext 5889
Re: New FreeBSD 5.3 e-mail server extremely slow...
On Tue, Jan 04, 2005 at 12:38:48PM -0500, Bruce Campbell wrote:

> We upgraded from a dual 1.66GHz AMD running FreeBSD 4.7 to a dual
> 3GHz Xeon running FreeBSD 5.3 and the new server is painfully slow,
> even after turning spamassassin and yavr (yet another virus recipe)
> off. Load appears to be imapd/ipop3d (uw-imapd) related.

Same version as you were running before? Same configuration files?

Can you show us your kernel configuration and dmesg?

Kris
Re: New FreeBSD 5.3 e-mail server extremely slow...
Quoting Kris Kennaway [EMAIL PROTECTED]:

> On Tue, Jan 04, 2005 at 12:38:48PM -0500, Bruce Campbell wrote:
>
> > We upgraded from a dual 1.66GHz AMD running FreeBSD 4.7 to a dual
> > 3GHz Xeon running FreeBSD 5.3 and the new server is painfully slow,
> > even after turning spamassassin and yavr (yet another virus recipe)
> > off. Load appears to be imapd/ipop3d (uw-imapd) related.
>
> Same version as you were running before? Same configuration files?

Well, no, not quite.

old: imap-uw-2002_1,1
new: imap-uw-2004a,1

Just about all packages have undergone some updates on our new server.

The only processes for which we have hundreds running would be sendmail, procmail, ipop3d and imapd. But when I had sendmail configured to shut down mail when load av went over 12, load av would still shoot up to 40 or 50 and stay there, and the only major processes were imapd and ipop3d. And I noticed them calling setitimer a lot, with 80% system usage.

I'm about to pull the zero channel adaptec scsi raid card, for no other reason than I'm out of bright ideas.

> Can you show us your kernel configuration and dmesg?
>
> Kris

old: (difference from 4.7 GENERIC)

- cpu I386_CPU
- cpu I486_CPU
+ options QUOTA    #enable disk quotas
+ options SMP      # Symmetric MultiProcessor Kernel
+ options APIC_IO  # Symmetric (APIC) I/O

new: (difference from 5.3 GENERIC)

Reverted to non SMP for now, only difference from GENERIC is...

options QUOTA

I did have options SMP going for a while. Removing SMP has made no difference in load or responsiveness. Actually seems slightly better on one CPU.

dmesg.boot from new system is as follows:

Copyright (c) 1992-2004 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
The Regents of the University of California. All rights reserved.
FreeBSD 5.3-RELEASE #0: Thu Nov 25 15:48:15 EST 2004
    [EMAIL PROTECTED]:/usr/src/sys/i386/compile/MAIL_SERVER
Timecounter i8254 frequency 1193182 Hz quality 0
CPU: Intel(R) Xeon(TM) CPU 3.06GHz (3065.80-MHz 686-class CPU)
  Origin = GenuineIntel Id = 0xf27 Stepping = 7
  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
  Hyperthreading: 2 logical CPUs
real memory = 2146959360 (2047 MB)
avail memory = 2095419392 (1998 MB)
ACPI APIC Table: PTLTD APIC
ioapic0 Version 2.0 irqs 0-23 on motherboard
ioapic1 Version 2.0 irqs 24-47 on motherboard
ioapic2 Version 2.0 irqs 48-71 on motherboard
npx0: [FAST]
npx0: math processor on motherboard
npx0: INT 16 interface
acpi0: PTLTD RSDT on motherboard
acpi0: Power Button (fixed)
Timecounter ACPI-fast frequency 3579545 Hz quality 1000
acpi_timer0: 24-bit timer at 3.579545MHz port 0x1008-0x100b on acpi0
cpu0: ACPI CPU (2 Cx states) on acpi0
pcib0: ACPI Host-PCI bridge port 0xcf8-0xcff on acpi0
pci0: ACPI PCI bus on pcib0
pci0: unknown at device 0.1 (no driver attached)
pcib1: ACPI PCI-PCI bridge at device 2.0 on pci0
pcib1: could not get PCI interrupt routing table for \\_SB_.PCI0.HLB_ - AE_NOT_FOUND
pci1: ACPI PCI bus on pcib1
pci1: base peripheral, interrupt controller at device 28.0 (no driver attached)
pcib2: ACPI PCI-PCI bridge at device 29.0 on pci1
pci2: ACPI PCI bus on pcib2
em0: Intel(R) PRO/1000 Network Connection, Version - 1.7.35 port 0x3000-0x303f mem 0xf820-0xf821 irq 54 at device 3.0 on pci2
em0: Ethernet address: 00:30:48:29:c5:a8
em0: Speed:N/A Duplex:N/A
em1: Intel(R) PRO/1000 Network Connection, Version - 1.7.35 port 0x3040-0x307f mem 0xf822-0xf823 irq 55 at device 3.1 on pci2
em1: Ethernet address: 00:30:48:29:c5:a9
em1: Speed:N/A Duplex:N/A
pci1: base peripheral, interrupt controller at device 30.0 (no driver attached)
pcib3: ACPI PCI-PCI bridge at device 31.0 on pci1
pci3: ACPI PCI bus on pcib3
asr0: Adaptec Caching SCSI RAID mem 0xfc00-0xfdff,0xfb00-0xfbff, 0xf830-0xf83f irq 30 at device 3.0 on pci3
asr0: [GIANT-LOCKED]
asr0: ADAPTEC 2015S FW Rev. 3B05, 2 channel, 256 CCBs, Protocol I2O
uhci0: Intel 82801CA/CAM (ICH3) USB controller USB-A port 0x2000-0x201f irq 16 at device 29.0 on pci0
uhci0: [GIANT-LOCKED]
usb0: Intel 82801CA/CAM (ICH3) USB controller USB-A on uhci0
usb0: USB revision 1.0
uhub0: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub0: 2 ports with 2 removable, self powered
uhci1: Intel 82801CA/CAM (ICH3) USB controller USB-B port 0x2020-0x203f irq 19 at device 29.1 on pci0
uhci1: [GIANT-LOCKED]
usb1: Intel 82801CA/CAM (ICH3) USB controller USB-B on uhci1
usb1: USB revision 1.0
uhub1: Intel UHCI root hub, class 9/0, rev 1.00/1.00, addr 1
uhub1: 2 ports with 2 removable, self powered
uhci2: Intel 82801CA/CAM (ICH3) USB controller USB-C port 0x2040-0x205f irq 18 at device 29.2 on pci0
uhci2: [GIANT-LOCKED]
usb2: Intel 82801CA/CAM (ICH3) USB controller USB-C on uhci2
usb2: USB
Re: New FreeBSD 5.3 e-mail server extremely slow...
On Jan 4, 2005, at 4:45 PM, Bruce Campbell wrote:

> The only processes for which we have hundreds running would be
> sendmail, procmail, ipop3d and imapd.

I love procmail and would hate to live w/o it, but that would be my first suspect out of that list.

TjL
who once got a phone call from his ISP because of his .procmailrc (oops!)
Re: New FreeBSD 5.3 e-mail server extremely slow...
On Tue, Jan 04, 2005 at 04:45:16PM -0500, Bruce Campbell wrote:

> Quoting Kris Kennaway [EMAIL PROTECTED]:
>
> > On Tue, Jan 04, 2005 at 12:38:48PM -0500, Bruce Campbell wrote:
> >
> > > We upgraded from a dual 1.66GHz AMD running FreeBSD 4.7 to a dual
> > > 3GHz Xeon running FreeBSD 5.3 and the new server is painfully
> > > slow, even after turning spamassassin and yavr (yet another virus
> > > recipe) off. Load appears to be imapd/ipop3d (uw-imapd) related.
> >
> > Same version as you were running before? Same configuration files?
>
> Well, no, not quite.
>
> old: imap-uw-2002_1,1
> new: imap-uw-2004a,1

OK, that's where you should start, then. Go back to the software configuration that you know is working and see if it still misbehaves.

Kris
Re: New FreeBSD 5.3 e-mail server extremely slow...
On Jan 4 at 16:58, Timothy Luoma launched this into the bitstream:

> On Jan 4, 2005, at 4:45 PM, Bruce Campbell wrote:
>
> > The only processes for which we have hundreds running would be
> > sendmail, procmail, ipop3d and imapd.
>
> I love procmail and would hate to live w/o it, but that would be my
> first suspect out of that list.
>
> TjL
> who once got a phone call from his ISP because of his .procmailrc
> (oops!)

You too eh? :-)

-Colin
(who just *knows* it was _that_ special recipe on the procmail list that did us both in on the same day!!)

Sorta like a digital Montezuma's revenge [shudder]

sorry, couldn't resist.
Re: New FreeBSD 5.3 e-mail server extremely slow - traced to getpwnam maybe ?
Quoting Kris Kennaway [EMAIL PROTECTED]:

> > Well, no, not quite.
> >
> > old: imap-uw-2002_1,1
> > new: imap-uw-2004a,1
>
> OK, that's where you should start, then. Go back to the software
> configuration that you know is working and see if it still
> misbehaves.
>
> Kris

Thanks. I shut down imapd/ipop3d completely so I just had sendmail running, and load av. was still 20-30.

Anyways, I have just found something very odd with both 5.2.1 and 5.3 on multiple different systems here, including a brand new GENERIC install.

On 5.x, ls -l or ps waux is very slow with our /etc/master.passwd, which has 11320 entries.

I truss'ed those commands, and gave up after watching:

lseek(4,0x17d000,SEEK_SET)                       = 1560576 (0x17d000)
read(0x4,0x8074000,0x1000)                       = 4096 (0x1000)
lseek(4,0x17e000,SEEK_SET)                       = 1564672 (0x17e000)
read(0x4,0x8062000,0x1000)                       = 4096 (0x1000)
lseek(4,0x17f000,SEEK_SET)                       = 1568768 (0x17f000)
read(0x4,0x8066000,0x1000)                       = 4096 (0x1000)
lseek(4,0x180000,SEEK_SET)                       = 1572864 (0x180000)

scroll by for 10 minutes. (handle 4 = /etc/spwd.db)

I wrote a small program:

  #include <sys/types.h>
  #include <pwd.h>
  main( int argc, char *argv[] ) { getpwuid( 13076 ); }

and ran it under truss on 5.x and it generated 178,711 lines of output. (the bulk of which is those lseek/read calls as above)

4.7 (with same master.passwd file) gave 59 lines of output, which seems normal.

I'm speculating that imap and sendmail and just about everything use getpwuid, and getpwuid is misbehaving on 5.x, especially with a large master.passwd file.

I will report this through the proper mechanism once I do just a bit more testing. And perhaps it is a known issue already and I'll look into that also. Or perhaps I have messed something up unwittingly, which I have been known to do.

We do have an extremely busy 5.2.1 system running here fine on the same hardware; it just has a small /etc/master.passwd, which may explain that system's success to date.

Thank you to everyone who sent suggestions.

--
Bruce Campbell
Engineering Computing CPH-2374B
University of Waterloo
(519)888-4567 ext 5889
Re: New FreeBSD 5.3 e-mail server extremely slow - traced to getpwnam maybe ?
On Tue, Jan 04, 2005 at 09:27:27PM -0500, Bruce Campbell wrote:

> I wrote a small program:
>
>   #include <sys/types.h>
>   #include <pwd.h>
>   main( int argc, char *argv[] ) { getpwuid( 13076 ); }
>
> and ran it under truss on 5.x and it generated 178,711 lines of
> output. (the bulk of which is those lseek/read calls as above)
>
> 4.7 (with same master.passwd file) gave 59 lines of output, which
> seems normal.
>
> I'm speculating that imap and sendmail and just about everything use
> getpwuid and getpwuid is misbehaving on 5.x especially with a large
> master.passwd file.

Try tuning the pwd_mkdb parameters (see hash(3)) in /usr/src/usr.sbin/pwd_mkdb/pwd_mkdb.c and recompile:

  HASHINFO openinfo = {
          4096,           /* bsize */
          32,             /* ffactor */
          256,            /* nelem */
          2048 * 1024,    /* cachesize */
          NULL,           /* hash() */
          0               /* lorder */
  };

e.g. adjust nelem to 12000 to accommodate your significantly-larger-than-average password database. If this helps, please submit a PR requesting that someone make an option to pwd_mkdb to tune this at runtime (or better yet, submit the patch to do this yourself - it's straightforward to modify the source to do this).

Kris