On Dec 23, 2011, at 8:15 AM, Robert Schetterer wrote:

> On 23.12.2011 02:45, Marc Perkel wrote:
>>>> This is handling ~250K messages/day, although with some tweaks to
>>>> serialize mail delivery a little more to level off the extreme peaks in
>>>> messages/second it should probably be able to handle a lot more volume.
>>>> 
>>>> We also have several SA instances - on the inbound side, the first pass
>>>> has ~25 of the top-scoring only-hits-spam rules (mostly DNSBLs) to skim
>>>> off the junk that would usually score 15+ on a full ruleset.  Anything
>>>> that gets past that is then passed to a full SA instance with a long
>>>> list of local rules targeted at the ones reported as missed spam by
>>>> customers.  That first pass tags more than 80% of the junk for far less
>>>> processing cost than feeding it all through the full ruleset.


We are processing 300k+ messages/day (peaks of up to 1M/day) with 3 mail
servers plus 1 dedicated MySQL server, replicated to one older server, and so
far we haven't seen any performance degradation from keeping Bayes in MySQL
with the InnoDB engine. The mail servers are dual-socket Xeons with 8G RAM; the
MySQL server is a dual-socket Xeon with 48G RAM, but SA Bayes is not the most
heavily used database on that server. We are using amavisd-new instead of spamd.
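
For a sense of scale, the daily volumes above can be converted to average
per-second rates. This is a minimal sketch: the function names are made up for
illustration, and it assumes load is spread evenly over the day and across the
3 mail servers, which real traffic is not, so actual peaks in messages/second
will be higher than these averages.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def avg_rate(messages_per_day):
    """Average messages per second, assuming an even spread over 24 hours."""
    return messages_per_day / SECONDS_PER_DAY


def avg_rate_per_server(messages_per_day, servers):
    """Average messages per second per server, assuming an even split."""
    return avg_rate(messages_per_day) / servers


if __name__ == "__main__":
    # 300k/day and 1M/day are the figures quoted above; 3 is the server count.
    for label, per_day in (("typical day", 300_000), ("peak day", 1_000_000)):
        total = avg_rate(per_day)
        each = avg_rate_per_server(per_day, 3)
        print(f"{label}: {total:.1f} msg/s total, {each:.1f} msg/s per server")
```

That works out to roughly 3.5 msg/s on a typical day and about 11.6 msg/s on a
peak day, averaged over all three servers.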

However, we did see some degradation when we moved to the new MySQL server, and
the following tweaks helped:
- sizing the InnoDB engine correctly
- optimizing the MySQL buffer sizes
- disabling the RAID battery auto-learn cycle
- optimizing the I/O scheduler
- tuning the kernel's network settings
- tuning the kernel swappiness level
- using Mail::SpamAssassin::BayesStore::MySQL instead of
  Mail::SpamAssassin::BayesStore::SQL
- manually pruning auto-whitelisting data and Bayes data
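
To illustrate the last item, here is a minimal sketch of age-based pruning of
Bayes tokens. It is only an illustration under stated assumptions: the table is
assumed to look like the stock bayes_token layout (id, token, spam_count,
ham_count, atime), an in-memory sqlite3 database stands in for MySQL so the
sketch runs anywhere, and the function names and the cutoff value are
hypothetical.

```python
import sqlite3


def prune_tokens_older_than(conn, cutoff_atime):
    """Delete tokens last accessed before cutoff_atime (a Unix timestamp).

    Returns the number of rows removed.
    """
    cur = conn.execute(
        "DELETE FROM bayes_token WHERE atime < ?", (cutoff_atime,)
    )
    conn.commit()
    return cur.rowcount


def make_demo_db():
    """Build a tiny in-memory stand-in for the Bayes token table (assumed schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE bayes_token ("
        "id INTEGER, token TEXT, spam_count INTEGER, "
        "ham_count INTEGER, atime INTEGER)"
    )
    conn.executemany(
        "INSERT INTO bayes_token VALUES (?, ?, ?, ?, ?)",
        [
            (1, "aaa", 3, 0, 1200000000),  # old token
            (1, "bbb", 0, 2, 1250000000),  # old token
            (1, "ccc", 1, 1, 1320000000),  # recent token
        ],
    )
    conn.commit()
    return conn


if __name__ == "__main__":
    demo = make_demo_db()
    removed = prune_tokens_older_than(demo, 1300000000)
    print(f"removed {removed} stale tokens")
```

A real deployment would also have to keep any per-user bookkeeping consistent
with the deleted rows and would want to test this on the replica first; both
are left out of the sketch.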

Currently our MySQL Bayes data holds over 2M tokens and we don't see any
performance impact on SpamAssassin. Our backups run against the replicated
database, so they have no performance impact on our primary MySQL server.

I don't have any numbers comparing MySQL and PostgreSQL, but I believe that
newer versions of MySQL and its derivatives (Percona Server etc.) have improved
quite a lot compared to older ones.

regards, Jernej


