On Mon, 23 Jun 2003 [EMAIL PROTECTED] wrote:

> > -----Original Message-----
> > From: Andrew McNaughton [mailto:[EMAIL PROTECTED]
> > Sent: Monday, 23 June 2003 5:03 PM
> > To: James Gray
> > Subject: Re: [SLUG] Opinions sought: Exim vs Sendmail
> >
> >
> >
> > It sounds to me like you might want to allow more concurrent processes
> > than you are at present.  Also, the main resource you need more of is
> > going to be memory.  Have your bean counters taken in that an extra
> > 512MB is really petty cash level expenditure?  In any case, is there
> > really any problem with allowing more concurrent accesses?
> >
> > I agree with the suggestion about running your mail filter as a daemon
> > if possible.  If this serializes your filtering then it should help a
> > great deal.
> >
> > Andrew
>
> Spamassassin is already running as a daemon and chews up about 24MB RAM
> just for the parent process - that's almost 10% of our physical RAM!
> spamd children (according to vmstat + ps + top) are all reporting
> similar usage (23-27MB RAM).  As you suggest the lack of RAM is killing

Most of that memory should be shared, so it's probably not quite as bad as
you're suggesting?

> us.  I did some burst load testing on it yesterday and found that
> without limiting child processes for spamd it only took 15 messages in
> under 5 seconds to send system load to 15!  14 children of spamd caused
> so much paging that the system ground to a halt (load peaked at 19!!).

I don't know the ins and outs of how the spamassassin daemon works, but
this is not how you'd want it to work.  The daemon should limit the
number of children operating at any given time, with enough parallel
requests that your system has stuff to do during waits for remote DNS
and checksum checks, but not so many that you load up on processes all
competing for CPU.

> So I did some quick calculations and decided 3 spamd children per CPU
> (with 256MB RAM) would be appropriate (given average time per message,
> RAM, other process requirements etc).  I managed to send 50 messages in
> 7 seconds and system load hit 15 for less than 3 seconds and then
> quickly returned to <1.  Paging was non-existent and no connections were
> refused.  So startup scripts are now "spamd -m 3....."

Sounds better.

You've got a fundamental limit on the number of messages you can process
in a given time - each message takes a given amount of CPU time, which
gives you a pretty good idea of how many messages you can process in a
given block of time.
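
As a back-of-envelope version of that limit, something like the
following would do.  The function name and the figures in the usage
note are made up for illustration, not measurements from your system,
and it ignores time spent waiting on the network.

```python
def max_messages(cpu_seconds_per_message, window_seconds, cpus=1):
    """Upper bound on messages filtered in a window, if CPU is the limit."""
    if cpu_seconds_per_message <= 0:
        raise ValueError("cpu_seconds_per_message must be positive")
    # Total CPU seconds available in the window, divided by the cost of
    # one message, rounded down to whole messages.
    return int(cpus * window_seconds / cpu_seconds_per_message)
```

If a message cost half a second of CPU, max_messages(0.5, 60) would
give 120 messages a minute on one CPU.  Anything arriving faster than
that has to queue somewhere.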

At some level of CPU activity you would ideally want to change strategy:
stop doing mail filtering while the remote MTA is connected and start
putting mail into a spool.  You'd want this spool to be processed as
resources are available, and mail either delivered or bounced accordingly.
A bounce isn't as good as giving an error while you've got the remote MTA
still connected, but it allows you to process things at your leisure.
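
In outline, the switch might look like the Python sketch below.  This
is only a sketch of the logic, not an exim or postfix configuration:
the threshold, the function names and the in-memory list used as a
spool are all assumptions, and scan() is whatever filter you supply
(returning True for mail that should be delivered).

```python
import os

LOAD_THRESHOLD = 4.0  # assumed 1-minute load average at which to spool


def choose_strategy(load_1min, threshold=LOAD_THRESHOLD):
    # Under the threshold, filter while the remote MTA is connected;
    # at or over it, accept the mail and defer the filtering.
    return "spool" if load_1min >= threshold else "filter_inline"


def handle(message, spool, scan, load_1min=None):
    if load_1min is None:
        load_1min = os.getloadavg()[0]  # Unix only
    if choose_strategy(load_1min) == "spool":
        spool.append(message)
        return "queued"
    # Remote MTA still connected: an error can be returned directly.
    return "accepted" if scan(message) else "rejected"


def drain(spool, scan):
    """Process spooled mail later, when resources are available."""
    delivered, bounced = [], []
    while spool:
        message = spool.pop(0)
        (delivered if scan(message) else bounced).append(message)
    return delivered, bounced
```

handle() would be called per incoming message and drain() from a
periodic job once the load has dropped.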

This delayed processing has other advantages too.  Spam checksum
systems take a short while to receive spam reports, so delaying
processing can mean improved results.  I've been thinking of doing a
procmail-based solution to this on my own mailbox, some time after
delivery.

How you'd set up this spooling I'm not sure.  I would very much like to
hear about such arrangements using exim or postfix.

Andrew


--

No added Sugar.  Not tested on animals.  If irritation occurs,
discontinue use.

-------------------------------------------------------------------
Andrew McNaughton           In Sydney
                            Working on a Product Recommender System
[EMAIL PROTECTED]
Mobile: +61 422 753 792     http://staff.scoop.co.nz/andrew/cv.doc



-- 
SLUG - Sydney Linux User's Group - http://slug.org.au/
More Info: http://lists.slug.org.au/listinfo/slug
