> The client mail server would serve whatever combination
> I would like make a big server with qmail +vpopmail +mysql +procmail.
> I think in this structure:
> Server 1: Mx domain + smtp delivery +filters (Antispam, user filter(procmail)
> and antivirus)
> This server basically is the mail gateway of all domains, where is passed in
> the filters rules per domain and redirect all mails to server 2
>
> Server 2: Pop3 accounts + mysql server
> Here is created all accounts.
>
> This schema is good for multiple domains?
Based on my experience, I agree, but I might split Server2 into
Server2 (delivery/storage/database)
and
Server3 (pop/imap/webmail servers for clients).
I include more details below for one economical infrastructure I
worked with. It's not a HOWTO, but knowing what someone else has
done might help guide you instead of figuring it out from scratch.
> Another question is how do i do the message delivery the messages from
> server 1 to server2?
Qmail! :^)
In /var/qmail/control/smtproutes, set it so that all mail goes
to Server2 (eg: ":server2.mydomain.com"). If you're fancy, you can
try QMTP instead of SMTP.
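
For illustration, here is a minimal Python sketch of how a route in
smtproutes gets picked: exact host first, then each suffix that starts
at a dot, then the empty catch-all entry. It is a simplified model of
the lookup, not qmail's code. The function names and the
".example.net" entry are made up for the example.

```python
def parse_smtproutes(text):
    """Parse smtproutes-style lines: "domain:relay" or "domain:relay:port"."""
    routes = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        domain, _, relay = line.partition(":")
        routes[domain.lower()] = relay
    return routes


def find_relay(routes, host):
    """Return (relay_host, port) for a recipient host, or None for no route."""
    host = host.lower()
    # Most specific first: full name, then ".suffix" forms, then catch-all "".
    candidates = [host]
    candidates += [host[i:] for i in range(1, len(host)) if host[i] == "."]
    candidates.append("")
    for key in candidates:
        if key in routes:
            relay = routes[key]
            if not relay:
                return None  # empty relay: fall back to normal MX delivery
            name, _, port = relay.partition(":")
            return (name, int(port) if port else 25)
    return None


routes = parse_smtproutes(":server2.mydomain.com\n")
print(find_relay(routes, "example.org"))  # ('server2.mydomain.com', 25)
```

With only the catch-all line in the file, every recipient host maps to
Server2, which is the behaviour described above.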
--
Eric Ziegast
A sample large server environment (hundreds of domains, thousands of
users) I once helped with:
The MX record points to multiple cheap parallel inbound mail
servers:
- Single CPU PC at the best Price/Performance cost.
I've found that one can build these for $300 each. You will
find that when doing Virus/Spam scanning that the first
bottleneck that you hit (out of CPU/memory/disk/network) is CPU.
All of the regular expression searching on an e-mail message
takes processing power. Assuming you have enough RAM, disk I/O
would be the next bottleneck. I found a good balance in an AMD 1800+
motherboard /w 512MB PC133 RAM and 7200RPM IDE. Another option
is investing in a very fast multi-processor Intel screamer with
lots of RAM, but the cheap and disposable servers are linearly
scalable.
- RAM depends on how many simultaneous connections you want to
be allowed for Spam/Virus filtering. I used 512MB on a cheap
system because RAM is cheap these days. I usually ran out of
CPU before memory. If the OS uses any significant amount of
virtual memory, you need more RAM. Run vmstat. If "pi" or "po"
("si" or "so" in Linux vmstat output) is above 0, you need more
RAM or need to lower the number of
simultaneous connections allowed by qmail (eg: "concurrencyincoming"
in /var/qmail/control). The inbound server is your "mail firewall"
and doesn't have time for paging to disk when the message load
is high.
- Hardware or software RAID1 7200+ RPM IDE drives are sufficient.
I have been told by a Linux integrator that Linux software RAID1
can be faster than the RAID1 provided by hardware controllers.
If you have a budget for SCSI, use it. You need merely a
9GB drive in an inbound relay server anyway because the mail
doesn't sit on the server. In fact, you may see a disk I/O
improvement if you limit /var/qmail/queue to a 2GB partition
of the hard drive. If you don't need the space, you don't
need to have the disk head potentially cross the entire disk
to find data. If you select a hardware RAID controller,
prefer a controller that has non-volatile RAM or RAM /w a battery.
This will allow the controller to use write-back mode on write
and significantly reduce response time between the computer and
the hard drives.
- While I love OpenBSD and FreeBSD, I've used Linux for Qmail
services because I've had other Linux-capable staff that
could help administer the servers. Another advantage to
Linux is ReiserFS. I have used ReiserFS on /var/qmail/queue
partitions with success after applying the fsync patches.
(http://www.jedi.claranet.fr/qmail-reiserfs-howto.html)
ReiserFS performs well with thousands of files in a directory and
allows you to keep the default hash value (23) for the spool
directory. If using ufs (Solaris/BSD), consider compiling a queue
hash value of some large prime number (like 101). If using Linux
without ReiserFS, at least use ext3 instead of ext2 so that you can
recover after a crash. If using Solaris, consider VxFS if you
have the ability to use it. A standard fsck of a non-journaled
filesystem used for qmail REALLY sucks. Aside: I don't export
ReiserFS over NFS - just use it for the mail relays themselves.
For vpopmail directories, I use filesystems that are known to
be tried and tested in heavy read/write environments under NFS.
I hope ReiserFS gets to this state, but at the time of my
implementation, it was easier for me to use ext3 for vpopmail
dirs.
- I followed instructions for using QmailScanner /w SpamAssassin
(spamc -f -c) and a Virus checker. I found QmailScanner to
be quite inefficient and significantly rewrote it to not
break up the message into a zillion pieces for its internal
scanning. SpamAssassin (spamd) does that for you anyway, and
so does virus scanning software. I'd have the qmail-scanner
programs mark the messages with a "X-Spam-Status" and
"X-Virus-Status" header and forward the message to my central
mail server. Filtering software (your choice is procmail) on
the central mail server wouldn't have to scan messages, just
headers to figure out what to do with the message.
- One well-balanced server was able to handle 40 simultaneous
connections while filtering each message. I started with two
servers. If the load increased, I could add more. At $400 each,
using generic mid-tower boxes, it was easy to justify new boxes.
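
The vmstat check mentioned for the inbound relays can be automated.
Below is a rough Python sketch that reads captured vmstat text and
reports whether any paging/swap column was above 0. It locates the
columns by header name and accepts both the "pi"/"po" and the
"si"/"so" spellings. The function names are hypothetical, and the
script assumes the plain column layout vmstat normally prints.

```python
def paging_activity(vmstat_text, columns=("pi", "po", "si", "so")):
    """Return {column: highest value seen} for paging columns in vmstat output."""
    header = None
    peaks = {}
    for line in vmstat_text.splitlines():
        fields = line.split()
        if not fields:
            continue
        # The header row is the one that names the paging columns.
        if any(name in fields for name in columns):
            header = fields
            continue
        # Skip banner lines and anything that is not a full numeric row.
        if header is None or len(fields) != len(header):
            continue
        if not all(f.lstrip("-").isdigit() for f in fields):
            continue
        for name, value in zip(header, fields):
            if name in columns:
                peaks[name] = max(peaks.get(name, 0), int(value))
    return peaks


def needs_more_ram(vmstat_text):
    """True if any paging column was above 0 in the captured samples."""
    return any(value > 0 for value in paging_activity(vmstat_text).values())
```

Feed it the output of something like "vmstat 5 12" collected during
peak load. If needs_more_ram() comes back True, add RAM or lower the
number of simultaneous connections.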
The central mail storage server:
- The vpopmail directories and mysql server should be served
from a machine with a reliable motherboard and reliable storage.
A fast CPU isn't necessary. Lots of memory isn't necessary. 512MB RAM
is still sufficient, and a 1GHz processor was still plenty. If
you have access to a Network Appliance back-end, I recommend it
highly, but if you're like me, you don't have much of a budget.
I would use fast SCSI 73GB drives (10000+ RPM) in a RAID1
configuration. I would have a hot-spare drive. I would have
NVRAM on the RAID controller if possible. If using Linux, I'd
use ext3 if NFS were required (ReiserFS if this is also my POP/IMAP
server). If I need more storage, I'd add more RAID1 pairs to
the SCSI chain. Your main bottleneck is likely to be disk I/O.
Use iostat to see if read/write requests need to wait. If
- In one setup, I put mysql and final qmail delivery into a
large spool on one server. I then had POP3 clients and Web
clients access the spool via NFS. If I needed more I/O
between clients and their mail, I could add more cheap client
servers. I used hardware RAID0+1 SCSI storage on the main
Vpopmail home directory with ext3 (not ReiserFS because NFS
reliability for ReiserFS had not yet been proven). I used
73GB partitions. If you have a good budget, consider using
Network Appliance filers for your back end storage. They're
fast and reliable and are amazingly good at random disk I/O
when you use many spindles in a RAID group.
- If something catastrophic ever happens to your qmail/vpopmail
storage server, a user might live with an outage for a couple
hours, but not for a full day. I would have a backup IDE disk
that would be ready to serve in the event that the primary
storage is unavailable. I would do rsyncs between the primary
storage and the backup drive as often as possible (at least
once a day). The directory structure with a lot of e-mail
accounts would take a lot of time to recreate. Having it
available on a disk nearby helps. I haven't had to use a
backup disk, but I sleep better knowing it's there. I know
the backup disk would be slower than primary storage and
add a bottleneck, but it's better than being down.
- If you're going to use procmail or other filtering software,
consider doing what you can to minimize the impact of the
filtering on the storage server.
- One method is having all filtering done on the inbound
servers and perform final mailbox delivery from the filtering
servers into an NFS vpopmail spool. Synchronization and
idempotency of NFS writes shouldn't be an issue for qmail,
but it could seriously lock up inbound delivery if NFS ever
fails. ("mount -o rw,soft,intr,noatime ...")
- Another method is having filtering and final delivery done
on the mail storage server itself. There's less network
traffic with this method compared to NFS, and qmail's queueing
allows the inbound servers to handle an outage on the main mail
server gracefully (sessions tempfail and queue up), but now
there's significant processing being done on the central mail
server. The central server could become a processing bottleneck,
and it's possible that a high enough load for procmail filtering
could make it less responsive to clients (POP, etc). You never
want to overwhelm your file server doing non-file-server tasks.
If you run procmail on the central storage server, at least use
"nice 10 COMMAND". I'm also wary of the security implications
of running procmail on the same server central to mail processing
for everyone. What if someone introduces a program as part of
the procmail filters that has a security bug or sucks up all
available CPU on the server? I generally shy away from this
scenario.
- At one site, I implemented a batch processing system. Mail
would be "delivered" by vdelivermail program into a user's
mail spool on the central storage server. The user would not
be able to run any filters on inbound mail. By "delivered",
I mean that the message would exist in ~user/Maildir/tmp/MESSAGE#.
I would have vdelivermail stop before moving the message into
the "new" subdirectory and instead append the message delivery
information to a file that got rolled every minute. After a
periodic delay of at least one minute, another program on the
storage server would look at the list of new messages and serially
process them through user-controlled filters (a jailed perl script).
The customers are given some knobs for tuning (spam score threshold,
sender address, recipient address, content search) and the filtering
program would apply them to each batched message for final delivery.
In final delivery, the message is moved into "~user/Maildir/new"
or "~user/Maildir/junk" or some user-specified CourierIMAP folder.
The MySQL database contains a list of user preferences for their
mail filtering with this program. The message is optionally
tagged by the filter with "spam" in the subject in case a POP user
wants to use that subject tag for filtering in their Outlook mail client.
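
To make the batch step concrete, here is a small Python sketch of the
final-delivery decision for one message. It is not the jailed perl
script described above, just an illustration of the idea: read only
the headers, pull the score out of "X-Spam-Status", compare it with
the user's threshold, and rename the file out of Maildir/tmp. The
"score=" / "hits=" header layout, the prefs dictionary and the
function names are assumptions made for the example.

```python
import os
import re
from email.parser import BytesHeaderParser

# Assumed header layout, e.g. "Yes, score=7.5 required=5.0 ..."
SCORE_RE = re.compile(r"(?:score|hits)=(-?\d+(?:\.\d+)?)")


def spam_score(path):
    """Read only the headers of a message file and return its spam score."""
    with open(path, "rb") as fh:
        headers = BytesHeaderParser().parse(fh)
    match = SCORE_RE.search(str(headers.get("X-Spam-Status", "")))
    return float(match.group(1)) if match else 0.0


def choose_folder(score, prefs):
    """Pick the Maildir subdirectory from the user's threshold knob."""
    return "junk" if score >= prefs["spam_threshold"] else "new"


def deliver(maildir, filename, prefs):
    """Move a message from Maildir/tmp into its final folder."""
    src = os.path.join(maildir, "tmp", filename)
    dest_dir = os.path.join(maildir, choose_folder(spam_score(src), prefs))
    os.makedirs(dest_dir, exist_ok=True)
    dest = os.path.join(dest_dir, filename)
    os.rename(src, dest)  # same filesystem, so the move is atomic
    return dest
```

A real batch processor would loop over the rolled list of new
messages, load each user's preferences from the database, and also
apply the sender/recipient/content knobs. Only the threshold is shown
here.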
The web mail and POP and IMAP servers are simply scalable via NFS:
- Platform: up to 1GHz, 1GB RAM, single-disk PCs ($200) or more
PCs similar to the inbound relays.
- User vpopmail directories are seen via an NFS mount of the central
mail storage server.
- Install user-based web services (eg: pop3d, imapd, apache, sqwebmail),
and SSL wrappers. Note: The number of simultaneous sessions depends
most upon available RAM. More sessions -> more RAM.
- This server could also be the outbound SMTP server for users.
One could take the POP logs and use them for POP-before-SMTP
authentication, or the qmail-smtpd service could be configured
for SMTP-auth with queries against the central MySQL server on
the mail storage server. With multiple servers, SMTP-auth
works better, but it is possible to create a summary of POP
traffic on all servers to build a central POP-before-SMTP
database.
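
As a sketch of the central POP-before-SMTP summary, the Python below
merges login events that have already been extracted from each POP
server's logs and keeps the most recent login per client IP. The
(ip, unix_time) tuple format, the function names and the 30-minute
relay window are all assumptions for the example. Log parsing is left
out because it depends on which POP daemon you run.

```python
def merge_pop_logins(per_server_events):
    """Merge (client_ip, unix_time) events from several POP servers.

    Returns {client_ip: time of the most recent successful login}.
    """
    latest = {}
    for events in per_server_events:
        for ip, when in events:
            if when > latest.get(ip, float("-inf")):
                latest[ip] = when
    return latest


def may_relay(latest, ip, now, window_seconds=1800):
    """True if this IP logged in via POP within the allowed window."""
    when = latest.get(ip)
    return when is not None and 0 <= now - when <= window_seconds


pop1 = [("192.0.2.10", 1000)]
pop2 = [("192.0.2.10", 1500), ("192.0.2.20", 900)]
table = merge_pop_logins([pop1, pop2])
print(may_relay(table, "192.0.2.10", now=2000))  # True
```

The merged table is what each outbound SMTP server would consult
before allowing relaying.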
Scaling:
- The database and storage server must be reliable. This
server is not linearly scalable, but if properly designed,
one can process a significant amount of storage before
needing to upgrade. To scale this out to multiple servers,
one needs a way of hashing/routing inbound mail to the correct
storage server - think of how "vdominfo -d" or "vuserinfo -d"
or /var/qmail/users/assign are used by vdelivermail/qmail and
how one might be able to use symlinks to point users into
different NFS mounts. Adding a user or domain becomes more
complicated if one can't use a single directory tree for
their mail storage.
- The mail filtering servers must have good CPU, but don't
need to be reliable. They are linearly scalable.
- The client access processors don't need to be reliable if you
have some method for failover (IP address takeover, LVS) or
load balancing (Alteon, Foundry, BigIP, etc). They are
linearly scalable. Outbound smtp works better with a local
"dnscache" DNS resolver (http://cr.yp.to/djbdns) than a local
nameserver on the network.
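
One simple way to picture the hashing/routing of inbound mail across
several storage servers is a stable hash of the domain name. The
Python sketch below is only an illustration. The host names and the
function name are hypothetical, and the modulo scheme is an assumption
rather than anything vpopmail provides.

```python
import hashlib


def storage_for(domain, storage_hosts):
    """Map a domain to one storage server with a stable hash."""
    digest = hashlib.sha1(domain.lower().encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(storage_hosts)
    return storage_hosts[index]


hosts = ["store1.example.net", "store2.example.net", "store3.example.net"]
print(storage_for("mydomain.com", hosts))
```

The same domain always lands on the same server as long as the host
list does not change. Adding a server remaps most domains with this
scheme, so an explicit domain-to-server table (for instance in the
MySQL database) may be the better choice once mail is already stored.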
This is an example that can service 20000+ users well.
One could decide to scale the single database/storage server
with some sort of server redundancy, or create clusters of
cheaper storage/database servers paired with relay/access
servers. One would need a method, though, of managing accounts
across multiple clusters.
Scaling to 1000000 users or more would involve commercial
storage systems (eg: Celerra or NetApp), load balancers
(eg: Alteon), hashing users or domains into multiple
clusters of access servers and storage/database managers,
and custom scripting to manage the creation and removal
of domains and users. One might also start looking at
the cost/benefit of commercial software packages when
dealing with large user bases.
"Life With Qmail" is your friend: http://LifewithQmail.org .
"Qmail-Scanner" is a good beginning Virus/Spam integration tool
for Qmail - http://qmail-scanner.sourceforge.net/ .