Hi list,

a little while ago I set up a cute little mail filter cluster based on Postfix, Amavisd-new & Co. Even though mail traffic is not that high (peaks are slightly above 6 million delivery attempts a day), the most important goals were to keep it as scalable as possible and to make all logging available to our support desk, and possibly also to VIP customers, through a dedicated web interface.

I'm really happy with the result: every single component exists at least twice. It doesn't matter which part is switched off (MX, filter, DB) - it always keeps running. And these days not only the support desk and VIP customers have "log access"; every single account can see all of its rejected or quarantined mail in the webmail frontend.

My personal wish is to improve it even more, allowing me to scale "globally". Right now there is a lot of traffic between the Amavis instances and their MySQL instances (master-master). In its current form, transactions and replication would not allow it to scale very far. Sure, we can add additional MySQL servers. We are able to do so on the fly and have already tested it successfully in production (150 GB on each host: stop connections to one of them, stop the database there, copy it to a third host, set up replication with log file and log position, and let both hosts catch up). But even with 20 MySQL servers, the write load on each of them continues to grow, and traffic between them could be heavy.

I've managed to solve the "web interface part" without using the Amavis database (I need it only for quarantine access by ID) by setting up a central log aggregation system, so there is only one task forcing me to keep the default Amavis database structure: Penpals. I consider Penpals a really useful feature, and I would not want to lose it. However, while my log system is designed to scale horizontally, the way Amavisd's database works forces me to keep that centralized approach.

I really hope I've been able to explain the problem ;-) And please don't confuse this with horizontal database partitioning - that's something I've been using in production ever since it became available. But it doesn't solve the main issue, it just allows me to scale "a little bit more". You have a lot of options to scale your inbound SMTP path (by domain, by source IP, whatever) - however, you cannot predict whether a reply to such a mail will travel through the same site.
Therefore, currently ALL sites need to be aware of ALL Penpals information - and this currently means that ALL sites are required to access the same database (even if replicated and partitioned, it's still the very same DB - this way you don't scale out far).

Here are some possible steps / solutions I could imagine, just some unsorted thoughts:

* Make Penpals modular, allowing different (or even custom) modules to be used for it.
* Use a hash based on your mboxes for partitioning - this could also help with the way it currently works. However, you would then no longer be able to use partitions for garbage collection. And combining mbox & time for hash computation once again doesn't allow Penpals to scale.
* Use Memcached - hashes based on Message-ID and mbox allow you to always query the correct server. Memcached itself is not designed to be redundant, it's nothing but a cache. That's something I could live with - losing part of my Penpals cache is not so critical. If you can't live with this, there are other similar implementations that also offer redundancy. Or you could use some locking voodoo to make Memcached redundant as well.
* Allow Amavis "storage" to be configured so that it stores just the quarantine and leaves out the msg/adr/rcp part. Where to look for a specific quarantined mail is something I could discover in my log files - log-parsing-based systems are able to scale.
* I would not store to the filesystem. Reason: garbage collection based on DB partitions is far cheaper. Allowing you to "partition" your file-based quarantine by putting files in different subfolders (for example on a weekly basis) could also be an option.

That's all so far.
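
To make the Memcached idea a bit more concrete, here is a rough Python sketch of how every site could compute the same cache server from mbox and Message-ID without talking to a central database. Everything in it is an assumption for illustration only: the server names are placeholders, the function names penpal_key and pick_server are made up, and rendezvous hashing is just one possible way to pick a server. Nothing here is existing Amavis code or configuration.

```python
import hashlib

# Hypothetical cache servers; placeholder hostnames, not taken from any real setup.
SERVERS = [
    "mc1.example.net:11211",
    "mc2.example.net:11211",
    "mc3.example.net:11211",
]


def penpal_key(mbox: str, message_id: str) -> str:
    """Build a fixed-length cache key from a local mailbox and a Message-ID."""
    # Assumption: mailboxes are compared case-insensitively and the
    # Message-ID is used without its surrounding angle brackets.
    norm = mbox.strip().lower() + "\n" + message_id.strip().strip("<>")
    return "penpal:" + hashlib.sha1(norm.encode("utf-8")).hexdigest()


def pick_server(key: str, servers=SERVERS) -> str:
    """Rendezvous hashing: the server with the highest hash for this key wins.

    Every site that knows the same server list picks the same server,
    and removing one server only moves the keys that lived on it.
    """
    def weight(server: str) -> bytes:
        return hashlib.sha1((server + "|" + key).encode("utf-8")).digest()

    return max(servers, key=weight)


if __name__ == "__main__":
    # Outbound mail: remember (sender mbox, Message-ID) on the chosen server.
    key = penpal_key("user@example.org", "<abc123@mail.example.org>")
    print(key, "->", pick_server(key))
    # Inbound reply: the recipient mbox plus the In-Reply-To value give the
    # same key, so any site ends up asking the same server.
```

The actual store and lookup calls are left out on purpose, since they depend on the client library in use. The point is only that the key alone decides which server to ask, so a site does not need to know where the original mail went out.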
I'm pretty sure I've forgotten something - but I'm confident one of you will find it ;-) Your feedback is more than welcome, and it doesn't need to be positive - feel free to tell me why you consider the proposed approach braindead or whatever ;-p

Kind regards,
Thomas Gelf