David F. Skoll wrote:
> On Fri, 27 Jan 2012 16:17:27 -0500
> Kris Deugau <kdeu...@vianet.ca> wrote:
>
>> Swap... is not happening.<g> If anything, during mail spikes,
>> they're CPU-bound. The rest of the time they're mostly idle.
>
> Do you have any custom rules?

Plenty. Before I did a review in ~November I think there were actually
more local rules than stock ones. I dropped quite a few rule groups
that weren't hitting anything, or only hit a couple of messages over
about a month, and it got back down to about half as many as in the
stock set.

> Maybe one of them is exhibiting pathological
> CPU behaviour triggered by certain messages?

... of course, the problem is then to find such a message. :/
And then to find what *other* necessary condition needs to come into
play for that message to trigger the lockup.
Very large text messages seem to be a key factor in the cases where I've been
able to *find* a copy of a message that was fed to spamd just before a
lockup. But if we catch one of these lockups in progress, and restart
that spamd parent... during the next Postfix queue run the large
message gets processed normally. O_o
The handful of large messages like this that I've managed to capture
also process just fine (if slowly, simply because of running complex
text matching on 200K+ of text) when checked later, either through the
live spamd cluster or in a test system.
-kgd