In the first email:

  # The lock file ensures that only 1 spamassassin invocation happens
  # at 1 time, to keep the load down.
  #
  :0fw: spamassassin.lock
  * < 400000
  | spamc -x

Kevin A. McGrail wrote:
> geoff.spamassassin140903 wrote:
> > Kevin A. McGrail wrote:
> > > Using procmail without MTA glue is OK for many uses.  I am wondering how
> > > many spamd connections you allow and if you have checked your logs?
> > >
> > > I also cannot remember but the uses of a lock file seem odd for
> > > something that can thread.  Any one know if that is a good idea to
> > > remove?
> >
> > I wonder if you could explain in simple terms what the lockfile achieves
> > in this situation? Is it even possible that it could cause messages to
> > bypass SA?
>
> I don't think a lockfile achieves anything because it's a call to a program.
> Procmail has some weird syntax so hopefully someone with some procmail-fu
> can tell us if a lock on a procmail system call does anything.

Well...  The comment in the example explains what the lock is
attempting to do.  I think that comment got missed in the follow-ups.
The lock serializes spamassassin invocations so that they run one at
a time instead of many in parallel.  That prevents a high system load
average from too many spamassassin processes running all at once.
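
To answer the "simple terms" question, here is the same recipe with
each piece annotated.  The annotations are my reading of the
procmailrc documentation, not something from the wiki page:

  # :0  starts a recipe
  # f   treat the pipe as a filter; its output replaces the message
  # w   wait for the filter to finish and check its exit code
  # :   the trailing colon asks for a local lockfile, named here
  :0fw: spamassassin.lock
  # only messages smaller than 400000 bytes are passed to the filter
  * < 400000
  | spamc -x

While one procmail holds spamassassin.lock, any other procmail
reaching this recipe waits for the lock before running the filter.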

Normally the MTA will receive incoming messages and will fork a
process for each incoming connection.  If the outside world connects
and sends 100 messages all at once then there will be 100 MTA
processes running in parallel.  If 10,000 arrive all at once then
some MTA process limit will probably prevent forking that many,
depending upon your configuration.  Each of those processes will try
to send its message through procmail and spamassassin in parallel
too.  Running 10,000 procmail processes in parallel probably won't be
a problem since procmail is lightweight.  However running perl
spamassassin 100 or 1,000 times in parallel can be quite a resource
hit on a moderate system!

Putting the lock in the procmail rule prevents more than one perl
spamassassin process from running at a time.  This keeps the system
from being overloaded by a spike from the outside world.  I want to
emphasize that the outside world impacts the system and can have the
effect of a DDoS just by overwhelming it with external connections.
The MTA has limits to prevent this, but those are tuned for normal
delivery.  The MTA maintainers won't know that you are running each
message through spamassassin and causing a higher load because of
it.  The default MTA limits are probably too high once spamassassin
is part of every delivery.

The procmail example comes from the wiki page example:

  http://wiki.apache.org/spamassassin/UsedViaProcmail

The wiki page example launches "spamassassin", not "spamc".  That is
an important difference in this case.  Someone has changed it to
spamc in the recipe above and preserved all else, including the
serialization lock.  spamc talks to spamd, so the number of messages
processed in parallel depends upon the spamd configuration.  In the
spamc case I would be inclined to remove the serialization lock and
let it be throttled on the spamd side instead.  That would make the
most sense to me.  Then tune spamd's limits as needed.
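
As a rough sketch of the spamd side, assuming you start spamd
yourself or can edit the options your init script passes to it (the
number is only illustrative; 5 is what the spamd documentation gives
as the default for --max-children):

  spamd -d --max-children=5

With a cap like that, spamc connections beyond the limit should have
to wait for a free child rather than each starting another perl
process.  That is the same kind of throttle the lockfile gave, just
with a limit larger than one.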

In summary I suggest removing the serialization lock from the spamc
recipe.  Give it a try and monitor system resource utilization.  Start
tuning at spamd.  Tune other things as needed afterward.

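  # No local lockfile here; spamd limits the concurrency instead.
  # -x turns off spamc's safe fallback, so errors give a failing exit.
  :0fw
  | spamc -x

  # If the filter above failed, make procmail exit with the same code
  # so the failure is reported back to the MTA.
  :0e
  {
    EXITCODE=$?
  }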

Bob
