Lindsay Haisley wrote:
I have two servers. Currently they're both running instances of spamd
with separate mysql databases, however I'd like to run both instances from
the same database on one of the servers. There are two ways to do this:
1. I can give the -d option to spamc where it's invoked in the mail
system, with the target being spamd on the master spamassassin server
via the VPN that connects the two boxes. spamd is already configured to
listen to it.
Mm, I don't think this does what you're hoping. spamd on any given
system will use the configured database (local or otherwise) - this is
**NOT** something the client can request.
From man spamc:
-d host[,host2], --dest=host[,host2]
In TCP/IP mode, connect to spamd server on given host
(default: localhost). Several hosts can be specified
if separated by commas.
This only affects which spamd server the client asks to process the
message; it doesn't affect any aspect of the actual processing.
2. I can let spamc invoke spamd on the local system but set the various
dsn params in secrets.cf to point to the MySQL database on the master
spamassassin server. The mysql server on this box is already listening
for queries from the other system via the VPN that connects them.
If all you're looking to do is use a single MySQL instance, then this is
your only choice.
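As a rough sketch only, the SQL-related lines in secrets.cf on the second
box would look something like the fragment below. The hostname, database
name, username and password are placeholders I've made up, not anything
from your setup, and only the blocks for whatever you actually keep in SQL
(Bayes, AWL, per-user scores) apply. Check the option names against the
docs for your SA version before relying on them.

```
# secrets.cf on the non-master server; every value here is a placeholder
# Bayes data in the shared MySQL database
bayes_store_module        Mail::SpamAssassin::BayesStore::MySQL
bayes_sql_dsn             DBI:mysql:spamassassin:sa-master.example
bayes_sql_username        sa_user
bayes_sql_password        changeme

# AWL in the same database
auto_whitelist_factory    Mail::SpamAssassin::SQLBasedAddrList
user_awl_dsn              DBI:mysql:spamassassin:sa-master.example
user_awl_sql_username     sa_user
user_awl_sql_password     changeme

# Per-user scores/preferences
user_scores_dsn           DBI:mysql:spamassassin:sa-master.example
user_scores_sql_username  sa_user
user_scores_sql_password  changeme
```

The master keeps the same lines pointing at its own MySQL, so both spamd
instances read and write one set of tables.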
Does anyone with some experience with spamassassin know which of these
two approaches would be better? Which would be fastest? Which would be
most conservative of bandwidth between the boxes?
A lot depends on the hardware you're using. If you're trying to squeeze
some last bits of performance out of a heavily-loaded system by
eliminating the SQL duplication, you'll probably have to tune the spamd
instances differently as well (eg, the system running MySQL won't be
able to support as many spamd children as the other one). You haven't
said what's in MySQL for SA; IME per-user Bayes and/or AWL for anything
more than a couple of hundred users sucks up too much IO (not to mention
the staggering disk requirements - even at today's disk prices).
The cluster I'm doing most of my SA tuning on these days currently has 3
machines running spamd, and a fourth running MySQL (and some other
unrelated services, otherwise it would run spamd as well). Each machine
has the same SA config pointing to the same database on that fourth
machine - but clients don't see this, and can't affect it.
If the machines are not on the same local Ethernet segment, you're
probably better off leaving well enough alone, because any gains you
make in eliminating the SQL duplication will be lost waiting for data to
move across the network. Or worse.
-kgd