On Wed, Feb 09, 2005 at 11:12:23AM +0100, Arvinn Løkkebakken wrote:
> Has anyone measured the difference in performance?
>
There are benchmark results available for a single server with MySQL on localhost. Note that the DBM tests were done with a local database with lock_method flock, and not over NFS.

http://wiki.apache.org/spamassassin/BayesBenchmark

I've done multi-system benchmarks, but not in a controlled enough environment to be able to publish the results. Let's just say that they are good, with the usual expected network latency overhead added on. If someone wanted to offer up a testbed of multiple machines, I'd be willing to spend the time testing various configurations.

> I guess it's obvious that the setup will perform better with bayes in
> SQL when several spamd servers are in use, but what if it's just 1 spamd
> server?

The short story is that, for SQL, learning is slower than with DBM and scanning is faster. Plus you lose, or at least push into the DB layer, a lot of the lock contention issues you get with DBM.

> How does the size of the database matter, e.g. will SQL perform better
> when the database is bigger?

This will largely depend on your DB tuning. I've heard of multi-gig and multi-hundred-gig bayes DBs. IMO (of course I'm a little biased) the DB format is very efficient (fixed-length rows and all that) and very fast. I've personally never run with a DB of more than a couple hundred megs. Will MySQL handle a large DB better than DB_File? I have no hard evidence to back it up, but it should, right?

Michael