Hello

There are 86400 seconds in a 24-hour day, and if it takes you 10 seconds
per message (high but possible with a large number of remote tests) with
just one process (unlikely) then you are going to be capped at 8,640
messages per day at flat-rate (nobody gets perfectly-distributed traffic
patterns, especially with email).

This is a perfectly reasonable explanation, except that according to the graphs the number of messages detected as spam has a steady upper limit while the number of messages not detected as spam fluctuates wildly. I am working under the assumption that every message counted has already passed through SpamAssassin and has been classified as either "clean" or "spam". Thus, even clean messages are messages that have been processed. This means that if we were hitting a hardware-specific limit, the same ceiling should apply to "clean" messages (the blue line in the graph). However, this is not the case.


Another point to note is that if we were hitting a hardware limit on the number of messages that can be processed, mail would be queued and would wait for processing until CPU became available. This queueing should be reflected in the graphs, but it is not.
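
To illustrate the queueing argument, here is a rough sketch of what a capped scanner would look like hour by hour. The arrival numbers are made up for illustration and are not taken from our graphs; the capacity of 360 per hour simply assumes one process at 10 seconds per message.

```python
def simulate_backlog(arrivals_per_hour, capacity_per_hour):
    """Return (processed, backlog) lists with one entry per hour.

    Messages that cannot be scanned in the hour they arrive are
    carried over to the next hour.
    """
    processed, backlog = [], []
    waiting = 0
    for arrived in arrivals_per_hour:
        waiting += arrived
        done = min(waiting, capacity_per_hour)
        waiting -= done
        processed.append(done)
        backlog.append(waiting)
    return processed, backlog


# Hypothetical hourly arrivals (illustrative only).
arrivals = [200, 500, 900, 400, 100, 100]
processed, backlog = simulate_backlog(arrivals, capacity_per_hour=360)
print("processed:", processed)
print("backlog:  ", backlog)
```

In a model like this the total processed per hour flattens at the capacity for both clean and spam mail, and a backlog builds up during the peak and drains afterwards. That is the pattern I would expect to see if hardware were the limit.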

Here is some additional information about the system:
The server we are examining has 4 CPUs, each averaging 10-18% usage, with a load average between 0.2 and 0.8 throughout the day.
The box has 2 GB of memory, of which 82% is in use.


~Jerome Cartagena


On Feb 25, 2005, at 10:39 AM, Eric A. Hall wrote:


On 2/25/2005 1:28 PM, Jerome Cartagena wrote:

problem/question is that according to our statistics we are reaching
some sort of upper bound on spam scanning performance. I have attached
2 files to help demonstrate what I am talking about. I am wondering if
we are hitting some sort of performance limit on our mail scanning
machines or is it simply the case that this is how much spam we are
actively collecting. I'd appreciate any comments, ideas, or
suggestions on a possible explanation regarding this situation.

The number of messages you can process per unit of time depends on several factors, namely:

  * number of messages per unit of time

  * amount of time needed to process each message

  * units of time available

  * number of processes/processors available

  * peak variances

There are 86400 seconds in a 24-hour day, and if it takes you 10 seconds
per message (high but possible with a large number of remote tests) with
just one process (unlikely) then you are going to be capped at 8,640
messages per day at flat-rate (nobody gets perfectly-distributed traffic
patterns, especially with email).
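
The flat-rate arithmetic above can be written out as a small sketch. The function name and the assumption that processes scale linearly are illustrative simplifications, not something measured on a real system.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86400


def daily_cap(seconds_per_message, processes=1):
    """Upper bound on messages scanned per day at a perfectly flat rate.

    Assumes every process is busy all day and that processes do not
    slow each other down, which real systems will not quite achieve.
    """
    return int(SECONDS_PER_DAY * processes / seconds_per_message)


print(daily_cap(10))               # one process at 10 seconds per message
print(daily_cap(10, processes=4))  # four processes under the same assumption
```

Real traffic is bursty, so the achievable daily total will normally sit below this figure.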


There are secondary factors like memory and cpu availability that will
affect processing capacity and you need to look at that too. But for
starters, plug in your own traffic values to see what you should be aiming
for in terms of target number of available processes at peak load, and
that will get you started.
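
As a starting point for the "plug in your own values" step, a minimal sketch of the target process count at peak load might look like this. The function name and the sample figures are hypothetical, and it ignores memory and CPU limits entirely.

```python
import math


def processes_needed(peak_messages_per_hour, seconds_per_message):
    """Smallest number of scanner processes that keeps up with the peak hour.

    Each process can handle 3600 / seconds_per_message messages per hour,
    so the peak workload in process-hours is rounded up to a whole process.
    """
    busy_seconds = peak_messages_per_hour * seconds_per_message
    return math.ceil(busy_seconds / 3600)


# Hypothetical peak of 900 messages in the busiest hour, 10 seconds each.
print(processes_needed(900, 10))
```

If fewer processes than this are available at peak, messages will queue until the load drops.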


--
Eric A. Hall http://www.ehsco.com/
Internet Core Protocols http://www.oreilly.com/catalog/coreprot/




