Re: [courier-users] pre-filtering with Courier

Lindsay Haisley Mon, 22 Dec 2008 16:21:10 -0800

While I'm at it, Gordon, I extend my sincere thanks to you for your work
on courier-pythonfilter and its modules.  I've worked with perlfilter,
and even wrote a crude virus filter for it that I used for several
years, but when I discovered Python I never looked back :-)  I've
probably forgotten most of my Perl knowledge.  All the scripting I do
for my servers now is in Python.  Stuff written in Python just
works ....

On Mon, 2008-12-22 at 14:11 -0800, Gordon Messmer wrote:
> I don't see anything obvious that needs to be cleaned up in your patch. 
>   I will probably apply it (except for the init message -- none of 
> the other filters print their configuration, and I'm weird about 
> consistency),

:-)  Probably with good reason.

>  but the benefit of the configuration item is probably 
> fairly small.  Adding the SA headers in the filter doesn't prevent you 
> from running SA again during local delivery and doesn't change the score 
> you'll get from doing so.

Yes, it may well change the score, since the per-mailbox Bayes filter may
be more (or less) well educated than the global one, and the tests
enabled at the global level may differ from those in effect at the
per-mailbox delivery level.  As things are now,
Courier is delivering (mostly) to virtual mailboxes and the SpamAssassin
username, associated with the ID that identifies Bayes data, user prefs,
etc. is [email protected] rather than just user (corresponding to a
system account).  This is one of the advantages of running SA with
configs in MySQL rather than flat files. It makes a lot of things _very_
convenient!

>   The only advantage you get is a very small 
> performance benefit from not re-writing messages that are small enough 
> to be scanned.  Compared to the actual scanning, saving the results of 
> the scan should be very fast, and a small fraction of the overall process.

The other thing that happens now, with SA analyzing emails per-mailbox
during delivery, is that if SA identifies a message as spam, it rewrites
the message, putting it in an attachment prior to segregating it.  I like
this, and I'd probably lose this capability if I moved SA processing
forward out of the delivery phase and had to rely on the X-Spam-Status
or X-Spam-Level header to simply segregate the email, although I could
probably re-write the Subject header to mark it as spam.  I really don't
know how SA behaves if one runs something through it twice, and it sees
its own headers the 2nd time through.
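
For illustration, here is a minimal sketch of the header-only approach
described above: read the verdict from X-Spam-Status and tag the Subject
header instead of wrapping the message in an attachment.  It assumes the
usual SpamAssassin 3.x header layout ("Yes, score=12.3 required=5.0 ...").
The helper names parse_spam_status and mark_subject and the "[SPAM]" tag
are hypothetical; they are not part of courier-pythonfilter or
SpamAssassin.

```python
import re
from email import message_from_string

# Assumed header layout: "Yes, score=12.3 required=5.0 tests=..."
# re.S lets the pattern span folded (multi-line) header values.
SPAM_STATUS_RE = re.compile(
    r"^\s*(Yes|No)\b.*?\bscore=(-?\d+(?:\.\d+)?)", re.I | re.S
)


def parse_spam_status(msg):
    """Return (is_spam, score) from X-Spam-Status, or None if unusable."""
    raw = msg.get("X-Spam-Status")
    if raw is None:
        return None
    match = SPAM_STATUS_RE.match(raw)
    if match is None:
        return None
    return match.group(1).lower() == "yes", float(match.group(2))


def mark_subject(msg, tag="[SPAM]"):
    """Prefix the Subject with tag when the header says the mail is spam."""
    status = parse_spam_status(msg)
    if status is not None and status[0]:
        subject = msg.get("Subject", "")
        if not subject.startswith(tag):
            # Message objects append on assignment, so delete first.
            del msg["Subject"]
            msg["Subject"] = f"{tag} {subject}".strip()
    return msg


if __name__ == "__main__":
    sample = message_from_string(
        "Subject: Cheap pills\n"
        "X-Spam-Status: Yes, score=12.3 required=5.0 tests=BAYES_99\n"
        "\n"
        "body\n"
    )
    print(mark_subject(sample)["Subject"])
```

This only marks the message; actually moving it to a spam folder would
still be up to whatever delivery rules are in place.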

> Beyond that, I'd caution you that you'll have to change the way some 
> things work in order for user preferences to be reliable in combination 
> with global filtering.  You'll need to aggregate all of your users' 
> whitelists (but not blacklists) in the daemon user's settings in order 
> to make sure that users receive mail when they whitelist an address, 
> even if the score exceeds your global limit.

I've been giving this some thought, but your point with regard to
whitelisting is one I hadn't fully considered (yet!)

I've set the global spam score limit to 10, and at this point I'm just
going to need to count on people being OK with whitelisting being
effective _only_ for email with a score less than this, since anything
bigger gets sent packing in SMTP.  Knowing my customers, I think they'll
be more than happy to get less spam, and if any of their correspondents
are sending them legit email with spam scores over 10 then we'll deal
with the problem as it comes up.  A whitelist is a whitelist, though,
and it wouldn't be all that difficult to write a Python script that
would run from a cron job that would aggregate everyone's whitelist
entries on a daily basis, especially since all the whitelist entries are
in the same database table :-)  Or better still, I've got an unused
field in the spamassassin.userpref table (added_by) which could be used
to identify the "owner" of a whitelisting even though the username is
"courier" (the global SA user), so I could keep track of which users
have the right to modify which "whitelist_from" entries for which
mailboxes for the customer UI.
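
A rough sketch of what such a cron-driven aggregation might look like
follows.  It assumes a userpref table with username, preference, value
and added_by columns, and uses "courier" as the global SA user, as
described above.  To keep the example self-contained it runs against an
in-memory sqlite3 database standing in for MySQL; a MySQL driver would
typically need %s placeholders instead of ?.  The function name
aggregate_whitelists is hypothetical.

```python
import sqlite3

GLOBAL_USER = "courier"  # the global SA user named above


def aggregate_whitelists(conn, global_user=GLOBAL_USER):
    """Copy every user's whitelist_from entries to the global user.

    The owning mailbox is recorded in added_by so that a UI can tell
    which user is entitled to modify which global entry.  Returns the
    number of entries copied.
    """
    cur = conn.cursor()
    # Drop copies made by an earlier run so the job can be re-run safely.
    # Global entries added by hand (added_by IS NULL) are left alone.
    cur.execute(
        "DELETE FROM userpref WHERE username = ? "
        "AND preference = 'whitelist_from' AND added_by IS NOT NULL",
        (global_user,),
    )
    cur.execute(
        "SELECT DISTINCT username, value FROM userpref "
        "WHERE preference = 'whitelist_from' AND username <> ?",
        (global_user,),
    )
    rows = cur.fetchall()
    cur.executemany(
        "INSERT INTO userpref (username, preference, value, added_by) "
        "VALUES (?, 'whitelist_from', ?, ?)",
        [(global_user, value, owner) for owner, value in rows],
    )
    conn.commit()
    return len(rows)


if __name__ == "__main__":
    demo = sqlite3.connect(":memory:")
    demo.execute(
        "CREATE TABLE userpref "
        "(username TEXT, preference TEXT, value TEXT, added_by TEXT)"
    )
    demo.execute(
        "INSERT INTO userpref (username, preference, value) "
        "VALUES ('a@example.com', 'whitelist_from', 'friend@example.org')"
    )
    print(aggregate_whitelists(demo), "entries aggregated")
```

Deleting and re-inserting on each run means a whitelist entry removed by
its owner also disappears from the global set at the next run.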

Blacklisting will still require SA to be run twice, of course, since
there's no reliable (logically consistent) way to implement this at a
global level per-mailbox.

-- 
Lindsay Haisley       | "In an open world,    |     PGP public key
FMP Computer Services |    who needs Windows  |      available at
512-259-1190          |      or Gates"        | http://pubkeys.fmp.com
http://www.fmp.com    |                       |

