Re: Recommendations for ASF SA Implementation

Mark Martinec Fri, 20 Mar 2015 05:43:34 -0700

2015-03-17 22:16, Kevin A. McGrail wrote:

I am working on recommendations for the ASF to modernize the
installation of SA for the foundation.


We have some givens:

Using Ubuntu
Using Postfix
Need to stick with maintainable packages
Likely needs to stay away from lots of tweaks and heavy customization
such as using MIMEDefang (unfortunate).

So I'd like any input you might have, on or off list.



Axb wrote:

| Although I'd suggest Fuglu, the obvious choice should probably be amavisd-new
| considering Mark is also highly involved in SA dev work.
| It's also distributed by Ubuntu so it would be one package less to maintain
| outside the distro. We'd get the best of both worlds.
| Axb

Thanks, Amavis would be my choice too :)))


back to Kevin:

Here's some questions I believe will help guide things:

Q1 - What is the best glue for SA for Postfix that does the following:
- uses spamc calls so that spamd's can be distributed and loadbalanced?


Amavis uses a standard protocol, SMTP, for communication with an MTA
instead of the proprietary spamc/spamd protocol. Other than that,
interfacing with SpamAssassin is pretty much the same as in spamd,
i.e. it uses a pre-forked set of processes which use the SpamAssassin library.

For this reason the performance is pretty much the same - the bottleneck
is processing rules in SpamAssassin.

can be distributed and load balanced?


Yes, can be distributed and load balanced. Two approaches are most
apparent:
- the classical approach is to run multiple postfix+amavis
combos on several hosts, and let MX DNS records distribute the load
across them. If a single IP address is desired, an SMTP proxy (such
as nginx) can do the task of load sharing in front of Postfix.
- if a single MTA is preferred with multiple content filters on
multiple hosts, then traffic from Postfix to amavisd instances
can be spread using HAProxy (or some other load balancer).

Note that it is beneficial to feed outgoing mail through amavis too
for the following reasons:
- the PenPals feature keeps track of ongoing conversations and
contributes negative score points to such, preventing some false
positives on marginal mail content (a requirement is a common
database for all amavis instances, preferably redis, possibly SQL);
- when SpamAssassin autolearning is enabled, outgoing mail
contributes its valuable share of ham samples;
- when an internal machine or a user mail account gets compromised
and starts spewing malware or spam, it will get detected and blocked;
- not to forget: to DKIM-sign outbound mail it needs to pass
through a signer. Amavisd can do DKIM signing (and verification).
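
As a rough illustration of the PenPals idea above, here is a minimal Python
sketch. It is not amavis code: the class PenPalsStore, the in-memory dict
(standing in for the shared redis or SQL database), the parameter names
max_bonus and halflife, and the exponential decay of the bonus are all
assumptions made for the example.

```python
import time


class PenPalsStore:
    """Hypothetical stand-in for the database shared by all amavis instances."""

    def __init__(self):
        # (local sender, remote recipient) -> time of the last outgoing mail
        self._sent = {}

    def record_outgoing(self, sender, recipient, when=None):
        """Remember that a local sender wrote to a remote recipient."""
        stamp = time.time() if when is None else when
        self._sent[(sender.lower(), recipient.lower())] = stamp

    def bonus(self, remote_sender, local_recipient, now=None,
              max_bonus=5.0, halflife=7 * 24 * 3600):
        """Return a negative score contribution for a reply to earlier mail.

        The bonus halves every `halflife` seconds (an assumed decay model).
        """
        now = time.time() if now is None else now
        key = (local_recipient.lower(), remote_sender.lower())
        sent_at = self._sent.get(key)
        if sent_at is None or sent_at > now:
            return 0.0  # no known conversation, no bonus
        age = now - sent_at
        return -max_bonus * 0.5 ** (age / halflife)
```

Outgoing mail has to pass through the filter for record_outgoing() to ever
be called, which is why feeding outbound mail through amavis matters here.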

- can implement clamav before SA call


Yes.

Also, considering that some of the third-party ClamAV rulesets
are prone to false positives, or intentionally target spam (not
viruses and other malware), amavis can be configured to reclassify
certain malware (by name) as spam, contributing to SpamAssassin score
and not blocking as malware right away.

- should silently discard emails if a virus is detected


Configurable, but you don't want to do that, and (as Reindl Harald
noted) it may even violate the law. Unwanted mail must be rejected
at the SMTP level (or delivered to a dedicated folder or quarantine),
it must not be lost. Amavis is nowadays typically deployed as a
before-queue Postfix content filter so that it can reject mail
while the original session is still open.

Keep in mind that antivirus software does occasionally produce
false positives, ClamAV with third-party rules even more so.
A legitimate sender must be notified if this happens.

- must use clamdscan but ideally can utilize some sort of socket
solution for clamd to run distributed and load balanced


Can do.  Amavisd can interface with clamd either through
clamdscan, or (preferably) by directly talking to it over
the clamd protocol (thus eliminating clamdscan from the setup).
As this is a normal TCP connection, it can be load balanced
using HAProxy, although it probably makes more sense to keep
amavis+clamd pairs on each host.
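
To show what "directly talking to it over the clamd protocol" can look
like, here is a minimal Python sketch of clamd's INSTREAM command: the
command name, then length-prefixed chunks (4-byte big-endian length),
then a zero-length chunk, after which clamd answers with one line. The
function names, the chunk size, and the host and port defaults
(127.0.0.1, 3310) are assumptions for the example, and this is not how
amavisd itself is implemented.

```python
import socket
import struct


def instream_frames(data: bytes, chunk_size: int = 8192):
    """Yield the byte frames of a clamd INSTREAM request for `data`."""
    yield b"zINSTREAM\0"                       # null-terminated command
    for start in range(0, len(data), chunk_size):
        chunk = data[start:start + chunk_size]
        yield struct.pack("!I", len(chunk)) + chunk
    yield struct.pack("!I", 0)                 # zero-length chunk ends the stream


def scan_bytes(data: bytes, host: str = "127.0.0.1", port: int = 3310,
               timeout: float = 30.0) -> str:
    """Send `data` to a clamd TCP socket and return its one-line verdict."""
    with socket.create_connection((host, port), timeout=timeout) as conn:
        for frame in instream_frames(data):
            conn.sendall(frame)
        reply = b""
        while not reply.endswith(b"\0"):       # reply is null-terminated too
            part = conn.recv(4096)
            if not part:
                break
            reply += part
    return reply.rstrip(b"\0").decode("utf-8", "replace")
```

Since this is plain TCP, the host given to scan_bytes() could just as
well be a load balancer address in front of several clamd instances.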

- should bound email over a certain threshold (let's say 5) and
silently discard email over a certain threshold for SA (let's say 10)


Possible. There are a couple of configurable spam score levels,
each with its configurable action:

  tag level  - adds X-Spam-* headers (ham or spam)
  tag2 level - adds X-Spam-* headers, claims it is spam
  tag3 level - adds X-Spam-* headers, claims it is blatant spam
  kill level - (typically) rejects mail (or can discard or deliver)

Quarantining at each spam level is configurable independently.
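
The levels above can be sketched as a simple threshold lookup. This is an
illustration in Python, not amavis configuration: the function name
spam_level is made up, the tag2 and kill values (5 and 10) are the
thresholds from the question, and the tag and tag3 values are arbitrary
assumptions.

```python
def spam_level(score, tag=2.0, tag2=5.0, tag3=8.0, kill=10.0):
    """Return the highest spam level reached by `score`.

    Assumes tag <= tag2 <= tag3 <= kill, which keeps the sketch simple.
    """
    levels = [
        ("kill", kill),   # typically reject (or discard, or deliver)
        ("tag3", tag3),   # headers added, flagged as blatant spam
        ("tag2", tag2),   # headers added, flagged as spam
        ("tag", tag),     # informational X-Spam-* headers only
    ]
    for name, threshold in levels:
        if score >= threshold:
            return name
    return "none"         # below the tag level: no headers added
```

A caller would then map each returned level to its configured action and
to its own quarantine setting.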

- Might use a few RBLs to decline connections to start


Yes. That belongs to Postfix.

- Implements a good implementation of greylisting


That belongs to Postfix.
I tend to shy away from greylisting, it is much less effective
than it used to be initially. In my opinion it does more harm than good.

- Temporary failure for scanning (virus or spam) failures


Yes. Any fatal/unrecoverable failure causes an SMTP temporary failure
(4xx response either from amavis or from an MTA). No mail can get lost.

Q2 - Do we happen to know who maintains SA for Ubuntu so we can try
and work to make sure the upcoming release of 3.4.1 is packaged?


No idea. I thought the ASF infrastructure runs on FreeBSD mostly.

Here's the high level draft if anyone has some thoughts:

- Implement a cluster of spamd servers with no Bayes but likely using
SQL prefs for some whitelist/blacklisting - Bayes not being used
because training and maintaining will likely be too difficult


I find bayes with autolearning very valuable (using redis backend,
mostly maintenance-free). Probably not so good at some general public
mail provider, but certainly good for a scope of users sharing mostly
technically oriented / common interests mail.

- Implement txrep with SQL backend


Haven't tried txrep yet.

- Implement a cluster of clamav boxes


ClamAV is usually faster than SpamAssassin. I'd keep several instances
of amavis+SpamAssassin+clamd (with or without a Postfix instance)
on multiple hosts if the load is really that high.

- Implement an SPF record


Yes, an unfortunate fact of life.

Not to forget, DKIM signing is essential, and it must be done *after*
mailing list fanout.

- Implement postfix with xyz glue to test email on a scalable # of mx's


Sure.

- Implement a few RBLs to block SMTP connections - I hate to recommend
this but ASF members are very sensitive to spam so I'm treading
lightly


Some high-quality RBLs at an MTA level are desired.
Postfix even implements weighting with a threshold
over multiple RBLs if desired.
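
The weighting idea can be sketched in Python as follows. This only
illustrates the logic (in Postfix itself it is configured, as far as I
know, in postscreen via postscreen_dnsbl_sites and
postscreen_dnsbl_threshold). The zone names, weights, threshold and all
function names below are hypothetical, and the query-name helper handles
IPv4 only.

```python
import socket


def dnsbl_query_name(ip: str, zone: str) -> str:
    """Build the DNS name to look up: reversed IPv4 octets plus the zone."""
    return ".".join(reversed(ip.split("."))) + "." + zone


def is_listed(ip: str, zone: str) -> bool:
    """True if the DNSBL zone returns any A record for this address."""
    try:
        socket.gethostbyname(dnsbl_query_name(ip, zone))
        return True
    except socket.gaierror:
        return False


def dnsbl_score(ip, weighted_zones, lookup=is_listed):
    """Sum the weights of all zones that list `ip`."""
    return sum(weight for zone, weight in weighted_zones.items()
               if lookup(ip, zone))


def should_reject(ip, weighted_zones, threshold, lookup=is_listed):
    """Reject only when the combined weight reaches the threshold."""
    return dnsbl_score(ip, weighted_zones, lookup) >= threshold
```

With weights such as {"strict.bl.example.org": 2, "loose.bl.example.net": 1}
and a threshold of 2, a hit on the loose list alone would not block the
connection, which limits the damage a single overeager list can do.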


For a high-level view on Amavis see the Wikipedia article:

  http://en.wikipedia.org/wiki/Amavis


Perhaps I should point out some more features that I find valuable:

- amavis can block mail based on declared MIME content type or MIME name,
  or based on a MIME part's content as classified by a file(1) utility.
  This helps with first waves of malware before virus scanners get their
  signatures updated, e.g. block MS executables;
- produces detailed logging in JSON (in addition to syslog). JSON logging
  can be valuable for effectively feeding into Elasticsearch/Logstash/Kibana
  or into Splunk or other log analyzers;
- large mail (over the SpamAssassin's limit) is not just blindly passed,
  but a truncated section of mail is passed to SpamAssassin for evaluation,
  with DKIM signature checks already done on the full pristine mail content,
  so that truncation does not invalidate signatures, yet in many cases
  SpamAssassin can still do its job reasonably well.


Mark
