I've been deploying a new system that uses SpamAssassin with an SQL database for user config, auto-whitelists, and Bayesian databases.

I started off with PostgreSQL because that's the database system we currently use. I was disappointed with the performance - it sometimes ran into minutes per message when autolearning. Normal processing was OK, but when a message was autolearnt it went hideously slow.

At this point I tried MySQL. I was gobsmacked at the performance difference. Admittedly I thought it might go a bit quicker, maybe even twice as fast, but it was more like 10-100 times as fast.

With PostgreSQL the database server was maxed out on CPU and IO, and the spam server had a handful of spamd processes consuming about 5% CPU each. With MySQL this changed to the spam server being maxed out on CPU and the database server practically idle. This seems to scale the way I want it to - many spam servers with one database.

I guess this email is just for information really, unless someone steps up and finds a bug in SpamAssassin/Perl to attribute this problem to :-) I'd certainly recommend people use MySQL over PostgreSQL at this stage.

Cheers,
Tim.

----- Stats -----

Here are some stats. They're not entirely accurate because of varying test conditions and hardware, etc. But they put across my point...

PostgreSQL:

There were 2470 messages that took on average 54.8 seconds to process.
1337 messages (54.13%) were processed in less than 10 seconds.
2151 messages (87.09%) were processed in less than 120 seconds.

MySQL:

There were 7550 messages that took on average 4.2 seconds to process.
7418 messages (98.25%) were processed in less than 10 seconds.
7549 messages (99.99%) were processed in less than 120 seconds.

----- Stats -----

-- 
Tim Bishop
http://www.bishnet.net/tim
PGP Key: 0x5AE7D984
