John Hardin wrote:
On Tue, 31 Jan 2012, John Hardin wrote:
You posted this command line:
/usr/local/bin/spamd -d -x -q -r /var/run/spamd.pid --min-children=59
--min-spare=1 --max-spare=1 --max-conn-per-child=100 -m 60 -s local1
-u spamd --timeout-child=60 -i 0.0.0.0 -A <IP list> --syslog-ident
spamd/main
Why don't we see something like "prefork: child states:
BIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII"?
A good question. After a major burst of mail has bumped up the number
of children, yes, I *do* see something like that in the log.
But most of the time, there are 6-8 children running.
...to answer my own question, max-spare overrides it?
Hm, I wondered if that might be the case.
Is there some reason you're setting max-spare to 1 instead of leaving it
at 2 or setting it to something like 5?
I think the *idea* was to prespawn all the child processes. Several
generations of hardware ago, around SA 3.0.x (possibly 2.6), there was
evidence that spamd wasn't (re)spawning new children fast enough to keep
up with *normal* mail flow, never mind spikes. The logs now show spamd
scaling up properly as needed, so we're probably overspecifying its
behaviour and could drop all the scaling options but the -m 60.
I've bumped --max-spare to 5 on one system just because.
OTOH... I'm having trouble seeing how this could cause the whole
spamd process tree to lock up completely for ~15 minutes. (It locks
hard enough that a monitoring process gets "connection timed out", even
though only 5 spamd children are running, as in an incident just
after noon today.)
On the third hand... if there *is* a subtle bug in spamd's process
scaling, is it worth jumping *way* out there and trying --round-robin?
I'd turn on debug logging... but we already generate ~450M of spamd
logs daily between the two machines.
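
Short of debug logging, the "prefork: child states:" lines already in
the normal log could be tallied to see how many children exist around a
lockup. A minimal sketch, not a tested tool: the function name
summarize_child_states is hypothetical, and it assumes one letter per
child in the state string (commonly read as B = busy, I = idle), so it
just counts whatever letters appear.

```python
import re
from collections import Counter

# Matches the state string in lines containing
# "prefork: child states: <letters>"; any syslog prefix is ignored.
CHILD_STATES = re.compile(r"prefork: child states: (\S+)")


def summarize_child_states(lines):
    """Return one (total_children, Counter of state letters) per matching line.

    Assumption: each character in the state string stands for one child.
    """
    summary = []
    for line in lines:
        match = CHILD_STATES.search(line)
        if match:
            states = match.group(1)
            summary.append((len(states), Counter(states)))
    return summary
```

Passing an open log file (any iterable of lines) to
summarize_child_states gives one tuple per logged state line; a run of
small totals right before a lockup would at least rule the child count
in or out as a factor.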
-kgd