Re: [vchkpw] NFS / Disk Access / Load Concerns on Vpopmail cluster
> What we're seeing is that our network and RAID 5 IDE-based disk array on our central mail store server is not able to keep up with the 'client' servers doing the POP3, IMAP, Webmail, and SMTP legwork.

I've found an interesting bottleneck with webmail. When people use POP or IMAP clients (Outlook, Mozilla, Opera, Thunderbird, etc.), the client application caches a lot of the information locally and synchronizes occasionally with the server to see if there are new messages. Things like browsing and searching run really fast because the user is utilizing the resources of their local PC to do most of the work.

With webmail, the session state is neither saved nor cached, so with each new request, the mailbox can be rescanned. A relatively modest webmail application might only rescan all headers and show subject lines. A complex application might scan all content in a folder to present content more fully. Without anything to throttle back the webmail server, it's possible that the webmail server software can pound the mail spool server to death.

I used to run a Qmail-based infrastructure for 4000 clients on a single slow machine without much memory. They used POP as their only pickup mechanism. We recently reimplemented on a Dell 1750 with two Xeon procs, a lot of RAM and a GigE backend to a NetApp filer with 14 fast disks, and I STILL notice that the machine sometimes slows down while people try to read their 140MB mailboxes via webmail. Sigh.

I put some bottlenecks on the search and retrieval algorithms of the webmail software to help protect the filer from a flood of queries, and we've been better since then. The power users with super-large mailboxes complain that it's slow, but now it's a localized problem rather than a problem that affects everyone.

Jeremy's comments are great for scaling the database, but it sounds to me that you're just maxed out on what you can serve over NFS.
An SQL select might take at most a few kilobytes of data on the network whereas a webmail scan of a 30MB mailbox will take, well, 30MB. Doh!

So what to do?

Instead of the centralized NFS mail spool (where the central spool becomes the bottleneck), you might consider splitting the user base across several machines. Each machine would have its own RAID1 mail spool. Each machine would be responsible for its own inbound SMTP and POP/IMAP/Webmail and use the local disk for the spool. Use lots of RAM for buffer cache to make sure your disk is hit less frequently. You might be able to centralize outbound SMTP. Once a machine fills up, you add another machine. This is one way to scale.

The big boys in the mailbox size wars (Google, Yahoo, Hotmail) can't afford centralized storage for their mailboxes. Look for each to roll out racks of distributed storage where each storage server is a 1/2 U box with a couple of large ATA disks in it. We might learn from this method of scaling.

> Before we take this costly step, what have you noticed for user / system loads before you start hitting the limits of your hardware?

Yes. I serve 6000 users right now. They used to all be POP, and life was good. Now a significant percentage of my new customers use webmail, and I'm not happy with how my current web-based mail reading software scales. I may have to hack it a lot to get it to perform well.

Something that would help is if we rolled out spam/virus filtering for everyone, which will cut 50% inbound mail and 10% viruses from being processed/stored/read and reread/reread/reread.

BTW: I separate SMTP processing (/var/qmail local RAID1 fast SCSI with battery cache) from user mail spool storage (/home/vpopmail NFS mount to filer). Putting /var/qmail on the NFS server might be another source of overload.

--
Eric Ziegast
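A note on the throttling described in the message above: the thread does not say how the webmail software was changed, so what follows is only a minimal Python sketch of one possible approach, capping how many mailbox scans may run against the mail store at once. It assumes a threaded webmail backend; `ScanThrottle`, `run_demo`, and the limit of 2 are hypothetical names and values chosen for illustration.

```python
import threading
import time
from contextlib import contextmanager


class ScanThrottle:
    """Cap how many mailbox scans may hit the mail store at the same time."""

    def __init__(self, max_concurrent):
        self._slots = threading.BoundedSemaphore(max_concurrent)

    @contextmanager
    def slot(self, timeout=None):
        # Wait for a free slot; with a timeout, give up instead of piling on.
        if not self._slots.acquire(timeout=timeout):
            raise TimeoutError("mail store busy, try again later")
        try:
            yield
        finally:
            self._slots.release()


def run_demo(workers=8, max_concurrent=2):
    """Start `workers` fake scans and return the peak number running at once."""
    throttle = ScanThrottle(max_concurrent)
    lock = threading.Lock()
    active = 0
    peak = 0

    def scan_mailbox():
        nonlocal active, peak
        with throttle.slot():
            with lock:
                active += 1
                peak = max(peak, active)
            time.sleep(0.01)  # stand-in for the real reads over NFS
            with lock:
                active -= 1

    threads = [threading.Thread(target=scan_mailbox) for _ in range(workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return peak
```

With a cap like this, a user with a very large mailbox waits for a slot instead of slowing everyone down, which matches the trade-off described above: the slowness becomes a localized problem.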
Re: [vchkpw] NFS / Disk Access / Load Concerns on Vpopmail cluster
Eric Ziegast wrote:

>> What we're seeing is that our network and RAID 5 IDE-based disk array on our central mail store server is not able to keep up with the 'client' servers doing the POP3, IMAP, Webmail, and SMTP legwork.

> I've found an interesting bottleneck with webmail. When people use POP or IMAP clients (Outlook, Mozilla, Opera, Thunderbird, etc.), the client application caches a lot of the information locally and synchronizes occasionally with the server to see if there are new messages. Things like browsing and searching run really fast because the user is utilizing the resources of their local PC to do most of the work. With webmail, the session state is neither saved nor cached, so with each new request, the mailbox can be rescanned.

I think, if you use sqwebmail, it *will* cache some information. I've got a very large mailbox, with over 5 messages (though split in 100 directories) amounting to over 350 MB of mail, mostly mailing lists like this one. When I open a folder the first time in sqwebmail, it takes a lot of time, but the second time, it's rather quick (as quick as opening a folder with 3000 messages can be).

I like sqwebmail, though I sometimes think I'm the only one and the rest of the world wants squirrelmail and IMP ;-)

> A relatively modest webmail application might only rescan all headers and show subject lines. A complex application might scan all content in a folder to present content more fully. Without anything to throttle back the webmail server, it's possible that the webmail server software can pound the mail spool server to death.

> I used to run a Qmail-based infrastructure for 4000 clients on a single slow machine without much memory. They used POP as their only pickup mechanism. We recently reimplemented on a Dell 1750 with two Xeon procs, a lot of RAM and a GigE backend to a NetApp filer with 14 fast disks, and I STILL notice that the machine sometimes slows down while people try to read their 140MB mailboxes via webmail.
> Sigh. I put some bottlenecks on the search and retrieval algorithms of the webmail software to help protect the filer from a flood of queries, and we've been better since then. The power users with super-large mailboxes complain that it's slow, but now it's a localized problem rather than a problem that affects everyone.

Good tip. You can try to run up-imapproxy (if you don't do that already) and see if it helps. It will try to cache at least the IMAP sessions.

> Jeremy's comments are great for scaling the database, but it sounds to me that you're just maxed out on what you can serve over NFS. An SQL select might take at most a few kilobytes of data on the network whereas a webmail scan of a 30MB mailbox will take, well, 30MB. Doh!

I'd also like to add that people perhaps overestimate what IDE-RAID can do compared with a true SCSI-RAID - especially in cases where a horrendous number of small, scattered files and highly concurrent access are involved (hello qmail). I always joke that nothing can beat a (current) IDE disk when installing Windows and Office - they are optimized for rather large files and sequential access to these. But mail spool (/var/qmail/queue/) and mail storage (~vpopmail/) ain't an Office installation.

> So what to do? Instead of the centralized NFS mail spool (where the central spool becomes the bottleneck), you might consider splitting the user base across several machines. Each machine would have its own RAID1 mail spool. Each machine would be responsible for its own inbound SMTP and POP/IMAP/Webmail and use the local disk for the spool. Use lots of RAM for buffer cache to make sure your disk is hit less frequently. You might be able to centralize outbound SMTP. Once a machine fills up, you add another machine. This is one way to scale.

> The big boys in the mailbox size wars (Google, Yahoo, Hotmail) can't afford centralized storage for their mailboxes.
> Look for each to roll out racks of distributed storage where each storage server is a 1/2 U box with a couple of large ATA disks in it. We might learn from this method of scaling.

I'd be interested to know how one can achieve this while still maintaining the single-system-image nature that a central mail storage with surrounding mysql slaves provides. Not that I want to start a we're-bigger-than-google kind of freakshow, but just in case I hit the wall with the current system.

>> Before we take this costly step, what have you noticed for user / system loads before you start hitting the limits of your hardware?

> Yes. I serve 6000 users right now. They used to all be POP, and life was good. Now a significant percentage of my new customers use webmail, and I'm not happy with how my current web-based mail reading software scales. I may have to hack it a lot to get it to perform well. Something that would help is if we rolled out spam/virus filtering for everyone, which will cut 50% inbound mail and 10% viruses from being
Re: [vchkpw] NFS / Disk Access / Load Concerns on Vpopmail cluster
On Mon, 5 Jul 2004 [EMAIL PROTECTED] wrote:

> Before we take this costly step, what have you noticed for user / system loads before you start hitting the limits of your hardware? Should we be having these issues with about 15,000 email users and 5 front-end 'work' servers?

Well, you're making me feel better about having only 3500 or so accounts on one box. The whole ordeal is making me re-think a few things about the design:

-vpopmaild will be nice - no need to have webmail on the same box
-mysqld works best on its own box, period.
-looking into front-ending with postfix will probably let me squeeze more out of the same hardware; qmail really thrashes the box around, especially if you have many over-quota users clogging up your queue with bogus spam bounces.

Charles

> -Simon
Re: [vchkpw] NFS / Disk Access / Load Concerns on Vpopmail cluster
Rainer Duffner wrote:

> Eric Ziegast wrote:

>>> What we're seeing is that our network and RAID 5 IDE-based disk array on our central mail store server is not able to keep up with the 'client' servers doing the POP3, IMAP, Webmail, and SMTP legwork.

>> I've found an interesting bottleneck with webmail. When people use POP or IMAP clients (Outlook, Mozilla, Opera, Thunderbird, etc.), the client application caches a lot of the information locally and synchronizes occasionally with the server to see if there are new messages. Things like browsing and searching run really fast because the user is utilizing the resources of their local PC to do most of the work. With webmail, the session state is neither saved nor cached, so with each new request, the mailbox can be rescanned.

> I think, if you use sqwebmail, it *will* cache some information. I've got a very large mailbox, with over 5 messages (though split in 100 directories) amounting to over 350 MB of mail, mostly mailing lists like this one. When I open a folder the first time in sqwebmail, it takes a lot of time, but the second time, it's rather quick (as quick as opening a folder with 3000 messages can be).

> I like sqwebmail, though I sometimes think I'm the only one and the rest of the world wants squirrelmail and IMP ;-)

There is at least one other sqwebmail user out here... I like the fact that it directly accesses maildirs rather than opening a connection via TCP/IP to retrieve mail.

Rick
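On the "slow the first time, quick the second time" behaviour mentioned above: the thread does not describe how sqwebmail caches, so the following is not its implementation. It is only a minimal Python sketch of one way a webmail front end could avoid rescanning a maildir folder on every request, by remembering the subject list together with the modification times of the folder's `new` and `cur` directories. `FolderIndexCache` and its `scans` counter are hypothetical names used for illustration.

```python
import os
from email.parser import BytesHeaderParser


class FolderIndexCache:
    """Remember each folder's subject list until its maildir directories change."""

    SUBDIRS = ("new", "cur")

    def __init__(self):
        self._cache = {}  # folder path -> (stamp, list of subjects)
        self.scans = 0    # number of full rescans, for illustration only

    def _stamp(self, folder):
        # Delivering, moving, or deleting a message changes the directory mtime.
        return tuple(
            os.stat(os.path.join(folder, sub)).st_mtime_ns for sub in self.SUBDIRS
        )

    def subjects(self, folder):
        stamp = self._stamp(folder)
        cached = self._cache.get(folder)
        if cached is not None and cached[0] == stamp:
            return cached[1]  # two stat() calls instead of reading every message

        # Cache miss: read only the header block of each message file.
        self.scans += 1
        parser = BytesHeaderParser()
        found = []
        for sub in self.SUBDIRS:
            directory = os.path.join(folder, sub)
            for name in sorted(os.listdir(directory)):
                with open(os.path.join(directory, name), "rb") as handle:
                    headers = parser.parse(handle)
                found.append(headers.get("Subject", ""))
        self._cache[folder] = (stamp, found)
        return found
```

One caveat with this kind of check: directory timestamps have limited granularity, and NFS clients cache file attributes, so a very recent delivery may not be noticed immediately.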
Re: [vchkpw] NFS / Disk Access / Load Concerns on Vpopmail cluster
Charles Sprickman wrote:

> On Mon, 5 Jul 2004 [EMAIL PROTECTED] wrote:

> -vpopmaild will be nice - no need to have webmail on the same box

vpopmaild has nothing to do with webmail, although I guess you could use it to retrieve mail through the back door. I'd suggest IMAP or POP3 based webmail instead.

vpopmaild is for administration of mail accounts.

Rick
Re: [vchkpw] NFS / Disk Access / Load Concerns on Vpopmail cluster
On Mon, 5 Jul 2004, Jeremy Kitchen wrote:

> Also, on top of that, I would consider disabling auth logging, as it performs an insert/update upon every authentication which, no matter what, will go back to your central mysql server, and if you have mysql being replicated, will be replicated to the front machines, which will almost entirely negate any performance increases you may (and very likely will) see by switching to a replicated mysql configuration.

Interesting... Refresh my memory on this, is it a compile-time switch? How does vpopmail behave if that table does not exist?

> Also! (last one, I promise!) If you're using vpopmail's roaming users support, stop now. Completely disable roaming users in your vpopmail configuration and set up Bruce Guenter's relay-ctrl package (http://untroubled.org/relay-ctrl). No funky cronjob to run, no patches required to ucspi-tcp (there's a patch out there to make it talk to mysql, eek), no central cdb file to rebuild upon connection attempts, AND it's safe to mount the spool directory on NFS (I've done it) as it doesn't require locking or anything.

Hmmm... Good suggestion, is there anything similar that will deal with Courier's pop3d?

Do you have a rough feel for at what point trying to decrease updates will help things along? 2000 users? 10,000 users?

Thanks,

Charles

> Hope this helps.
> -Jeremy
> --
> Jeremy Kitchen ++ Systems Administrator ++ Inter7 Internet Technologies, Inc.
> [EMAIL PROTECTED] ++ www.inter7.com ++ 866.528.3530 ++ 847.492.0470 int'l
> kitchen @ #qmail #gentoo on EFnet ++ scriptkitchen.com/qmail
Re: [vchkpw] NFS / Disk Access / Load Concerns on Vpopmail cluster
On Mon, 5 Jul 2004, Eric Ziegast wrote:

> With webmail, the session state is neither saved nor cached, so with each new request, the mailbox can be rescanned. A relatively modest webmail application might only rescan all headers and show subject lines. A complex application might scan all content in a folder to present content more fully. Without anything to throttle back the webmail server, it's possible that the webmail server software can pound the mail spool server to death.

There are two things I'm testing. The first is Turck-MMCache, which is a caching lib for PHP, similar to the commercial Zend stuff. It's free though. It saves all the overhead of compiling your webmail stuff on every hit (it caches pre-compiled code). This of course assumes you're using a PHP-based webmail app.

The other thing I'm looking at is an IMAP cache, which could probably help with the problems you're seeing as well. The squirrelmail list has some good info on this.

> I used to run a Qmail-based infrastructure for 4000 clients on a single slow machine without much memory.

The machine that this mail is going through is an AMD K6-2-450. It has maybe 20 or 30 mailboxes, tops. For years it has been more than adequate for this small task, but with a few catchall domains and no chkuser patch (I don't trust it as it rejects mail if mysql is not zippy enough) it can really get bogged down during spam runs. I'm sick of building boxes to accommodate all the spam; I need to build for the 85% of mail that just gets thrown away by SpamAssassin. That's my spam rant. :)

> Instead of the centralized NFS mail spool (where the central spool becomes the bottleneck), you might consider splitting the user base across several machines. Each machine would have its own RAID1 mail spool. Each machine would be responsible for its own inbound SMTP and POP/IMAP/Webmail and use the local disk for the spool. Use lots of RAM for buffer cache to make sure your disk is hit less frequently.
> You might be able to centralize outbound SMTP. Once a machine fills up, you add another machine. This is one way to scale.

That's been my plan, but my problem is that most of my users are all under one domain, so I'm not really sure how I can divvy up the users without yet another box doing mx in front and splitting the mail up...

> The big boys in the mailbox size wars (Google, Yahoo, Hotmail) can't afford centralized storage for their mailboxes. Look for each to roll out racks of distributed storage where each storage server is a 1/2 U box with a couple of large ATA disks in it. We might learn from this method of scaling.

Hopefully one of them will do a nice Usenix presentation like Earthlink did back in the day...

Thanks,

Charles

> --
> Eric Ziegast
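On dividing users of a single domain across several storage machines: nothing in the thread settles how to do this, so the following is only a minimal Python sketch of one option, a per-address placement table that a front-end MX or proxy could consult. It follows the "once a machine fills up, you add another machine" idea by filling hosts in order. `MailboxDirectory`, the capacity figure, and the host names are all hypothetical, and a real deployment would keep the table somewhere shared, such as a database.

```python
class MailboxDirectory:
    """Map each mail address to the storage host that holds its mailbox."""

    def __init__(self, capacity_per_host):
        self.capacity = capacity_per_host
        self.hosts = []      # storage hosts, in the order they were added
        self.load = {}       # host -> number of mailboxes placed on it
        self.placement = {}  # address -> host

    def add_host(self, host):
        self.hosts.append(host)
        self.load[host] = 0

    def assign(self, address):
        # Assumption: addresses are compared case-insensitively.
        key = address.lower()
        if key in self.placement:
            return self.placement[key]
        # Fill hosts in order; a full host never receives new mailboxes.
        for host in self.hosts:
            if self.load[host] < self.capacity:
                self.load[host] += 1
                self.placement[key] = host
                return host
        raise RuntimeError("all storage hosts are full; add another host")

    def lookup(self, address):
        """Return the host for an existing mailbox (raises KeyError if unknown)."""
        return self.placement[address.lower()]
```

Because existing placements never move, adding a host does not reshuffle anyone's mail, at the price of one extra lookup per delivery or login. A hash of the address would avoid the table, but changing the number of hosts would then remap existing users.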
Re: [vchkpw] NFS / Disk Access / Load Concerns on Vpopmail cluster
On Tue, 6 Jul 2004, Rick Widmer wrote:

> Charles Sprickman wrote:

>> On Mon, 5 Jul 2004 [EMAIL PROTECTED] wrote:

>> -vpopmaild will be nice - no need to have webmail on the same box

> vpopmaild has nothing to do with webmail, although I guess you could use it to retrieve mail through the back door. I'd suggest IMAP or POP3 based webmail instead.

In my case, it will. I have a few custom-built squirrelmail plugins that rely on being able to run various v* commands, so at this time I can't put webmail on a separate box. When I finally upgrade vpopmail to 5.4.x, I can rewrite all that stuff to work with vpopmaild...

Thanks,

Charles

> vpopmaild is for administration of mail accounts.

> Rick
Re: [vchkpw] NFS / Disk Access / Load Concerns on Vpopmail cluster
On Monday 05 July 2004 03:06 pm, [EMAIL PROTECTED] wrote:

> This mail store server is responsible for the store of the vpopmail/domains directory (NFS mounted by the 'client servers') and runs SQL select queries to verify passwords... The reads / writes on the disks are reaching the limit of the IDE RAID controller and the 100-base network.

Assuming you're using this central machine also as your (only) mysql server for the cluster, there are several things you can do to increase the performance:

1) Replicate the backend mysql server to the front machines, configure vpopmail with mysql replication support, and have each 'node' talk to its own local mysql server. This will decrease the amount of traffic across the network, and the load on the NFS server, TREMENDOUSLY.

2) Set up a dedicated mysql server (or cluster), perhaps even on a different physical network from that of your NFS server (multiple NICs in each machine), and have vpopmail talk to that.

I would recommend 1 over 2 as it's easier to set up and to 'migrate to' (you can do it while you're live), doesn't require purchasing more equipment, and also decentralizes everything all at once!

Also, on top of that, I would consider disabling auth logging, as it performs an insert/update upon every authentication which, no matter what, will go back to your central mysql server, and if you have mysql being replicated, will be replicated to the front machines, which will almost entirely negate any performance increases you may (and very likely will) see by switching to a replicated mysql configuration.

Also! (last one, I promise!) If you're using vpopmail's roaming users support, stop now. Completely disable roaming users in your vpopmail configuration and set up Bruce Guenter's relay-ctrl package (http://untroubled.org/relay-ctrl).
No funky cronjob to run, no patches required to ucspi-tcp (there's a patch out there to make it talk to mysql, eek), no central cdb file to rebuild upon connection attempts, AND it's safe to mount the spool directory on NFS (I've done it) as it doesn't require locking or anything.

Hope this helps.

-Jeremy

--
Jeremy Kitchen ++ Systems Administrator ++ Inter7 Internet Technologies, Inc.
[EMAIL PROTECTED] ++ www.inter7.com ++ 866.528.3530 ++ 847.492.0470 int'l
kitchen @ #qmail #gentoo on EFnet ++ scriptkitchen.com/qmail
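To make the point about auth logging and replication concrete, here is a small back-of-the-envelope model in Python. It is a sketch under stated assumptions, not a measurement: it assumes one password SELECT per login, takes from the message above that auth logging adds one insert/update per authentication that always goes to the central server, and assumes every replica replays every write. The function names are hypothetical.

```python
def master_queries(logins, auth_logging, local_replica):
    """Queries that must cross the network to the central MySQL server."""
    # Assumption: one SELECT per login to check the password.
    # With a local replica, that SELECT never leaves the front-end machine.
    selects = 0 if local_replica else logins
    # From the message above: auth logging does an insert/update on every
    # authentication, and writes always go to the central server.
    writes = logins if auth_logging else 0
    return selects + writes


def replayed_writes(logins, auth_logging, replicas):
    """Writes applied on the front ends, since each replica replays every write."""
    return logins * replicas if auth_logging else 0
```

For 1000 logins and 5 front-end machines, this model gives 2000 central queries with auth logging and no replication, 1000 central queries plus 5000 replayed writes with replication and auth logging on, and none of either with replication and auth logging off. The absolute numbers are illustrative only; the shape is what matters, since with logging left on every login still costs a central write and a write on each replica.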