Mark,

First of all, thank you so much for the amazingly thoughtful and in-depth reply!

--- On Wed, 9/15/10, Mark Martinec wrote:
> Yassen,
>
>> I want amavisd-new/spamassassin to use a different Spamassassin Bayes
>> database for each separate domain hosted on my mail server. That
>> is, if the first "To:" recipient is us...@firstdomain.com, then
>> I want Bayes tests (and learning) to be done against SA Bayes database
>> #1; if the first "To:" recipient is us...@seconddomain.com, then
>> Bayes tests (and learning) should be done against SA Bayes database
>> #2, and so on. If there is no "To:" recipient, Bayes tests (and
>> learning) should be done against a default Bayes database.
>
> This is a wrong approach for anything but a toy or a SOHO setup.

After your explanation, I see that clearly, no question...

>> [...] Then I can use policy banks to tune amavisd-new the way I want
>> it tuned for that specific domain,
>
> Policy banks apply to an entire message. They are an inappropriate
> mechanism for controlling per-recipient behaviour. Policy banks are
> typically associated with a sender or their IP address or authenticity,
> and not associated with recipients (one policy bank, multiple recipients).

Let me give a short background of my problem: I email-host half a dozen
domains, and amavisd-new does a great job filtering the mail using clamav,
SA, pyzor, razor and bayes (via SA). Bayes is a VERY helpful addition to
the other tests and greatly improves the spam filtering success.

What I noticed was that within a domain bayes works great, probably
because legitimate mail within a domain tends to have a lot in common
(also, spam tends to have things in common). The very contrary is true if
I compare different domains with each other -- users of different domains
use different languages, not to speak of other differences (I have
English-speaking domains, German, Bulgarian).

This is the reason I am seeking a way to make the bayes database work
"per domain" rather than being a global one for the whole install.
I guess the perfect solution would be to maintain a separate bayes db for
each user, but the very good results for installations with a single db
for a whole domain make me believe that this is a good approach that
will be a lot simpler and yet retain good quality. (Suggestions for
different approaches are welcome.)

>> but I still don't know how to get it to tell SA to look for its bayes db
>> at a domain-specific location. Anyone's help is highly appreciated.
>>
>> My current plan is to introduce $sa_bayes_path in amavisd-new config
>> file(s), have amavisd-new patched to honor that argument when calling SA,
>> and also have it listen on a separate port for each domain. I will then
>> use policy banks to tune that same $sa_bayes_path argument differently for
>> each of the different ports (=domains).

This didn't work for me; I guess because amavisd-new passes parameters to
SA only when instantiating it, that is, at startup time. So what I did was
essentially what Vernon advised (thanks, Vernon!):

--- On Sat, 9/11/10, Vernon A. Fort wrote:
> how about running each amavis with a different user account
> with each having a different home directory. each home
> directory would have a separate .spamassassin/bayes*

Only I do not employ different unix users; I rather use the amavisd-new
config files, basically having several configs that differ only in
$MYHOME and $inet_socket_port. My postfix setup uses

    smtpd_recipient_restrictions = ..., check_recipient_access

Didn't test this thoroughly, so not yet in production. Comments are welcome.

> The 2.7.0-pre7 has a new infrastructure in place which makes it possible
> to call SpamAssassin more than once per message, and even to load
> different SpamAssassin config files based on a recipient address (or domain),
> or based on a policy bank. It provides all the necessary internal support
> for per-recipient SpamAssassin processing.
> If you are doing any work
> in this area, the 2.7.0 is the codebase on which to ground any
> development work.

Sounds great (thank you for an amazingly useful piece of software!); I will
download and look at it as soon as I can. (It sounds like it won't need any
hacking to do the thing I need.)

> As it happens, the switching of SpamAssassin configurations between
> messages (or even within a processing of a single mail message with
> multiple recipients) is a rather costly operation. For the purpose
> of switching a username used for Bayes SQL lookups it suffices to
> tell SpamAssassin to switch a username without loading his preferences
> config file. Such username switching is a fairly inexpensive operation.

So I should consider using an SQL-based bayes database, correct?

> What remains to be done is to map a recipient address to a (virtual)
> username, then to group recipients (of a multirecipient mail) into
> sets of recipients with a common username (such as his domain name),

The domain name is what groups them according to my theory, yes.

> then call SpamAssassin once for each username, and distribute
> resulting scores back to each recipient as appropriate.

Sounds exactly like what I am trying to achieve!

> This is a fairly straightforward change from the current 2.7.0-pre7,
> based on all the already laid-down supporting mechanisms, and I guess
> I can make it into 2.7.0-pre8 without too much trouble, if someone
> is interested.

I am one (obviously); anyone else voting here?

Thanks again for all your effort, Mark!

Cheers,
Yassen
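
P.S. To check that I understand the grouping step Mark describes (map each
recipient to a virtual username, group recipients by username, scan once
per group, hand the score back to each recipient), here is a minimal
Python sketch. It is not amavisd-new code (which is Perl), and every name
in it is my own assumption for illustration: `username_for`,
`group_by_username`, `score_per_recipient`, the `scan` callback standing in
for one SpamAssassin pass, and the "default" fallback username.

```python
from collections import defaultdict

# Hypothetical fallback username for addresses that carry no domain part.
DEFAULT_USERNAME = "default"


def username_for(recipient):
    """Map a recipient address to a virtual username: its lowercased domain."""
    _local, sep, domain = recipient.rpartition("@")
    return domain.lower() if sep and domain else DEFAULT_USERNAME


def group_by_username(recipients):
    """Group the recipients of one message into lists sharing a username."""
    groups = defaultdict(list)
    for rcpt in recipients:
        groups[username_for(rcpt)].append(rcpt)
    return dict(groups)


def score_per_recipient(recipients, scan):
    """Call scan(username) once per group; give each member that result."""
    scores = {}
    for username, members in group_by_username(recipients).items():
        result = scan(username)  # stand-in for a single SpamAssassin pass
        for rcpt in members:
            scores[rcpt] = result
    return scores
```

With this shape, a message addressed to three users in two domains costs
two scans rather than three, which is the saving I understand the
per-username approach to be after.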