Re: Strange SpamAssassin Statistical Performance

Jerome Cartagena 25 Feb 2005 19:00:55 -0000

Hello

There are 86400 seconds in a 24-hour day, and if it takes you 10 seconds per message (high but possible with large number of remote tests) with just one process (unlikely) then you are going to be capped at 8,640 messages per day at flat-rate (nobody gets perfectly-distributed traffic patterns, especially with email).

This is a perfectly reasonable explanation, except that according to the graphs, the number of detected spam messages has a steady upper limit while the actual number of undetected spam messages fluctuates wildly. I am working under the assumption that all mail messages accounted for have already passed through SpamAssassin and have been classified as either "clean" or "spam". Thus, even clean messages are messages that have been processed. This means that if we were hitting hardware-specific limits, the same behavior should apply to "clean" messages (the blue line in the graph). However, this is not the case.

Another point to note is that if we were hitting hardware limits on the number of messages that can be processed, mail would be queued and would wait for processing until CPU became available. This queueing should be reflected in the graphs, but it is not.
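To illustrate what a capacity-bound scanner would look like on the graphs, here is a minimal sketch of a queue with a fixed hourly capacity. The function name `simulate_queue`, the hourly arrival figures, and the capacity of 360 messages per hour (one process at 10 seconds per message) are all assumptions for illustration, not measurements from our system.

```python
def simulate_queue(arrivals_per_hour, capacity_per_hour):
    """Return (processed, backlog) lists with one entry per hour.

    Hypothetical model: messages that cannot be scanned in the hour
    they arrive wait in a queue and are scanned in a later hour.
    """
    processed, backlog = [], []
    waiting = 0
    for arrived in arrivals_per_hour:
        waiting += arrived
        done = min(waiting, capacity_per_hour)  # cannot exceed capacity
        waiting -= done
        processed.append(done)
        backlog.append(waiting)
    return processed, backlog


# Hypothetical hourly totals (clean + spam), not real traffic data.
arrivals = [200, 500, 900, 400, 100, 100]
processed, backlog = simulate_queue(arrivals, capacity_per_hour=360)
print(processed)
print(backlog)
```

In this model the cap applies to the total number of processed messages, clean and spam alike, and the backlog keeps draining at full capacity after the peak has passed. Neither effect shows up in our graphs.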

Here is some additional information about the system: the server we are examining has 4 CPUs, each averaging 10-18% usage, with a load average between 0.2 and 0.8 throughout the day. The box has 2 GB of memory, of which 82% is in use.

~Jerome Cartagena


On Feb 25, 2005, at 10:39 AM, Eric A. Hall wrote:

On 2/25/2005 1:28 PM, Jerome Cartagena wrote:
problem/question is that according to our statistics we are reaching some sort of upper bound on spam scanning performance. I have attached 2 files to help demonstrate what I am talking about. I am wondering if we are hitting some sort of performance limit on our mail scanning machines or is it simply the case that this is how much spam we are actively collecting. I'd appreciate any comments, ideas, or suggestions on a possible explanation regarding this situation.
Number of messages you can process per unit of time depends on several
factors, namely:
  * number of messages per unit of time
  * amount of time needed to process each message
  * units of time available
  * number of processes/processors available
  * peak variances
There are 86400 seconds in a 24-hour day, and if it takes you 10 seconds per message (high but possible with large number of remote tests) with just one process (unlikely) then you are going to be capped at 8,640 messages per day at flat-rate (nobody gets perfectly-distributed traffic patterns, especially with email).

There are secondary factors like memory and cpu availability that will affect processing capacity and you need to look at that too. But for starters, plug in your own traffic values to see what you should be aiming for in terms of target number of available processes at peak load, and that will get you started.
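The arithmetic above can be sketched as a small calculation to plug your own values into. The function names and the peak figure of 3000 messages per hour are hypothetical; only the 86400 seconds per day and the 10 seconds per message come from the example above.

```python
import math

SECONDS_PER_DAY = 86_400


def daily_capacity(seconds_per_message, processes=1):
    """Upper bound on messages scanned per day at a perfectly flat rate."""
    return int(SECONDS_PER_DAY * processes / seconds_per_message)


def processes_needed(peak_messages_per_hour, seconds_per_message):
    """Minimum number of concurrent scanners needed to keep up at peak."""
    return math.ceil(peak_messages_per_hour * seconds_per_message / 3600)


print(daily_capacity(10))          # one process, 10 s per message -> 8640
print(processes_needed(3000, 10))  # hypothetical peak hour -> 9
```

Real traffic is bursty, so the peak-hour figure rather than the daily average is the one to size against.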

-- Eric A. Hall http://www.ehsco.com/ Internet Core Protocols http://www.oreilly.com/catalog/coreprot/
