Eric Ziegast wrote:
What we're seeing is that our network and the IDE-based RAID 5 disk array on our central mail store server are not able to keep up with the 'client' servers doing the POP3, IMAP, Webmail, and SMTP legwork.
I've found an interesting bottleneck with webmail. When people use
POP or IMAP clients (Outlook, Mozilla, Opera, Thunderbird, etc.),
the client application caches a lot of the information locally and
synchronizes occasionally with the server to see if there are new
messages. Things like browsing and searching run really fast because
the user is utilizing the resources of their local PC to do most of
the work. With webmail, the session state is neither saved nor cached,
so with each new request, the mailbox may be rescanned.
I think, if you use sqwebmail, it *will* cache some information.
I've got a very large mailbox, with over 50000 messages (though split across 100 directories) amounting to over 350 MB of mail, mostly mailing lists like this one.
When I open a folder the first time in sqwebmail, it takes a lot of time, but the second time, it's rather quick (as quick as opening a folder with 3000 messages can be).
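To illustrate why the second visit can be so much cheaper, here is a minimal Python sketch of a per-folder subject cache. It is not how sqwebmail actually stores its cache; the class name `FolderIndex` and its fields are made up for this example, and it assumes one message per file, as in a Maildir. Only files that were not seen on the previous pass get opened and parsed.

```python
import os
from email.parser import BytesHeaderParser


class FolderIndex:
    """Hypothetical cache of subject lines for one folder, keyed by filename."""

    def __init__(self, folder):
        self.folder = folder
        self.subjects = {}   # filename -> subject line
        self.parsed = 0      # how many files were actually opened and parsed

    def refresh(self):
        names = set(os.listdir(self.folder))
        # Forget messages that have been deleted since the last pass.
        for gone in set(self.subjects) - names:
            del self.subjects[gone]
        # Open only the files we have not indexed yet.
        parser = BytesHeaderParser()
        for name in names - set(self.subjects):
            with open(os.path.join(self.folder, name), "rb") as fh:
                msg = parser.parse(fh)
            self.subjects[name] = msg.get("Subject", "")
            self.parsed += 1
        return self.subjects
```

The first `refresh()` touches every file; later calls only cost a directory listing plus whatever is new. One simplification to be aware of: real Maildir filenames change when message flags change, so a real cache would key on the part of the name before the flags.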
I like sqwebmail, though I sometimes think I'm the only one and the rest of the world wants squirrelmail and IMP ;-)
A relatively modest webmail application might only rescan all headers and show subject lines. A complex application might scan all content in a folder to present content more fully. Without anything to throttle back the webmail server, it's possible that the webmail server software can pound the mail spool server to death.
I used to run a Qmail-based infrastructure for 4000 clients on a single slow machine without much memory. They used POP as their only pickup mechanism. We recently reimplemented on a Dell 1750 with two Xeon procs, a lot of RAM and a GigE backend to a NetApp filer with 14 fast disks, and I STILL notice that the machine sometimes slows down while people try to read their 140MB mailboxes via webmail. <sigh> I put some bottlenecks on the "search" and retrieval algorithms of the webmail software to help protect the filer from a flood of queries, and we've been better since then. The power users with super-large mailboxes complain that it's "slow", but now it's a localized problem rather than a problem that affects everyone.
Good tip.
You can try to run up-imapproxy (if you don't do that already) and see if it helps.
It will try to cache at least the IMAP-sessions.
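For the kind of throttle Eric describes on search and retrieval, a rough Python sketch could look like the following. This is not his actual change (he doesn't say how he did it); `ScanThrottle` and its parameters are names invented for the example. The idea is simply to cap how many expensive mailbox scans may hit the mail store at the same time, and to make the rest wait or give up.

```python
import threading


class ScanThrottle:
    """Hypothetical limiter: at most max_concurrent mailbox scans at once."""

    def __init__(self, max_concurrent):
        self._slots = threading.BoundedSemaphore(max_concurrent)

    def run(self, scan, *args, timeout=None):
        # Wait for a free slot; with a timeout, give up instead of piling on.
        if not self._slots.acquire(timeout=timeout):
            raise TimeoutError("mail store busy, try again later")
        try:
            return scan(*args)
        finally:
            # Always hand the slot back, even if the scan raised.
            self._slots.release()
```

With something like `throttle = ScanThrottle(4)`, every search or folder listing would go through `throttle.run(...)`. The user with the huge mailbox waits a bit longer, but the filer never sees more than four scans in flight.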
Jeremy's comments are great for scaling the database, but it sounds to me that you're just maxed out on what you can serve over NFS. An SQL select might take at most a few kilobytes of data on the network whereas a webmail scan of a 30MB mailbox will take, well, 30MB. Doh!
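To put rough numbers on that comparison, here is a small back-of-the-envelope calculation in Python. The link speeds (100 Mbit/s and 1 Gbit/s) and the 4 KB query size are assumptions picked for illustration, not figures from anyone's setup in this thread, and all protocol and disk overhead is ignored, so these are best-case values.

```python
def transfer_seconds(size_mb, link_mbit_per_s):
    """Best-case seconds to move size_mb megabytes (10^6 bytes) over the link."""
    return size_mb * 8 / link_mbit_per_s


def scans_per_second(mailbox_mb, link_mbit_per_s):
    """Best-case number of full mailbox scans per second the link can carry."""
    return link_mbit_per_s / (mailbox_mb * 8)


if __name__ == "__main__":
    # Assumed link speeds in Mbit/s: Fast Ethernet and GigE.
    for link in (100, 1000):
        print(link, "Mbit/s:",
              "30 MB scan takes", transfer_seconds(30, link), "s,",
              "4 KB query takes", transfer_seconds(0.004, link), "s")
```

On an assumed 100 Mbit/s link, one 30MB scan occupies the whole wire for 2.4 seconds, and even GigE carries only about four such scans per second at best.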
I'd also like to add that people perhaps overestimate what IDE-RAID can do compared with a true SCSI-RAID - especially in cases where a horrendous amount of small, scattered files and highly concurrent access are involved (hello qmail).
I always joke that nothing can beat a (current) IDE-disk when installing Windows and Office - they are optimized for rather large files and sequential access to these.
But mail-spool ("/var/qmail/queue/") and mail-storage ("~vpopmail/") ain't an Office-installation....
So.... what to do?
Instead of the centralized NFS mail spool (where the central spool becomes the bottleneck), you might consider splitting the user base across several machines. Each machine would have its own RAID1 mail spool. Each machine would be responsible for its own Inbound SMTP and POP/IMAP/Webmail and use the local disk for the spool. Use lots of RAM for "buffer cache" to make sure your disk is hit less frequently. You might be able to centralize outbound SMTP. Once a machine "fills up", you add another machine. This is one way to scale.
The big boys in the mailbox size wars (Google, Yahoo, Hotmail) can't afford centralized storage for their mailboxes. Look for each to roll out racks of distributed storage where each "storage server" is a 1/2 U box with a couple of large ATA disks in it. We might learn from this method of scaling.
I'd be interested to know how one can achieve this while still maintaining the "single-system-image"-nature that a central mail-storage with surrounding "mysql-slaves" provides.
Not that I want to start a "we're-bigger-than-google"-kind of freakshow, but just in case I hit the wall with the current system.
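One way this is commonly approached, sketched here purely as an assumption and not as something anyone in this thread runs: keep an explicit user-to-host mapping (for example as an extra column in the existing user database) and let every front end look up where a mailbox lives. In the Python sketch below, `SpoolDirectory`, the host names and the capacity figure are all hypothetical. New users go to the first machine that still has room, matching the "once a machine fills up, you add another machine" idea.

```python
class SpoolDirectory:
    """Hypothetical user -> spool host map; stands in for a database table."""

    def __init__(self, capacity_per_host):
        self.capacity = capacity_per_host
        self.hosts = []   # host names, in the order they were added
        self.home = {}    # user -> host that holds the mailbox

    def add_host(self, name):
        self.hosts.append(name)

    def assign(self, user):
        # Existing users never move; their host is fixed once assigned.
        if user in self.home:
            return self.home[user]
        counts = {h: 0 for h in self.hosts}
        for h in self.home.values():
            counts[h] += 1
        # Fill hosts in order; the first one with room gets the new user.
        for h in self.hosts:
            if counts[h] < self.capacity:
                self.home[user] = h
                return h
        raise RuntimeError("all spool hosts are full; add another machine")

    def lookup(self, user):
        # What an SMTP/POP/IMAP/webmail front end would ask before routing.
        return self.home[user]
```

An explicit table is used rather than hashing the user name, so that adding a machine never moves existing mailboxes. From the outside it still looks like one system, since only the lookup knows about the split.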
Before we take this costly step, what have you noticed for user / system loads before you start hitting the limits of your hardware?
Yes. I serve 6000 users right now. They used to all be POP, and life was good. Now a significant percentage of my new customers use webmail, and I'm not happy with how my current web-based mail reading software scales. I may have to hack it a lot to get it to perform well.
Something that would help is if we rolled out spam/virus filtering for everyone, which will cut 50% inbound mail and 10% viruses from being processed/stored/read and reread/reread/reread.
But the SpamAssassin + antivirus scanning also takes a big toll - you'd probably still have to do what Jeremy suggested... ;-)
cheers, Rainer