Re: Share bayes database between servers

2023-07-09 Thread Matija Nalis
On Sun, Jul 09, 2023 at 07:06:10PM +0200, Robert Senger wrote:
> I've set up a testing environment that also uses master-master
> replication of the mysql bayes database, with priority in dns set to
> equal for both mx to get incoming mail distributed evenly to both
> systems. So far, this seems to work, but this is a low load
> environment.

It boils down to how much you trust MySQL master-master replication
stability and performance, which depends heavily on your experience
and the exact versions used (are we talking about Oracle MySQL, or
the MariaDB or Percona forks? Which versions? What replication
setup? etc.)

I've had problems under high concurrent load (not performance, but
replication setup breaking) in the past, so I prefer to avoid
master-master replication if possible, especially if I anticipate
high concurrent load.

But if you are confident in it, sure, go ahead.

> Any suggestions?

Well, how are you training your bayes DB? If it is via cron and
manually curated ham/spam corpora (the recommended way), I'd rather
suggest keeping the databases separate and simply running training on
both servers (you can duplicate or share the ham/spam corpora as you
wish, via anything from rsync to SMB/NFS).
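
A minimal sketch of the cron-driven training described above, assuming the
curated corpus lives in ham/ and spam/ directories under a hypothetical
/srv/sa-corpus path (the path and the function name are illustrative, not
from the thread). It only builds the sa-learn command lines; per-user or
SQL-specific options are deliberately left out.

```python
import shlex
from pathlib import PurePosixPath

# Hypothetical corpus location; adjust to wherever the curated mail lives.
CORPUS_ROOT = "/srv/sa-corpus"


def build_training_commands(corpus_root):
    """Return the sa-learn invocations for one training run as argv lists."""
    root = PurePosixPath(corpus_root)
    return [
        # Learn ham and spam without syncing the journal after each step...
        ["sa-learn", "--no-sync", "--ham", str(root / "ham")],
        ["sa-learn", "--no-sync", "--spam", str(root / "spam")],
        # ...then sync once at the end.
        ["sa-learn", "--sync"],
    ]


if __name__ == "__main__":
    # Print the commands; a cron job on each server could run them as-is.
    for argv in build_training_commands(CORPUS_ROOT):
        print(shlex.join(argv))
```

Running the same script on both servers against the same (rsynced or shared)
corpus keeps the two databases independent but trained on identical input.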

If you are using auto-learn (which was not recommended last time I
looked), you'd probably be better off NOT syncing bayes at all, IMHO.
It is preferable to confine the risk of bayes poisoning to one server
instead of replicating it. There is also not much benefit: auto-learn
will quickly learn on each server separately anyway, and if one set of
domains is not getting some type of spam, there is little point in
learning it there.

-- 
Opinions above are GNU-copylefted.


Re: Best practice for adding headers?

2023-07-09 Thread Robert Senger
On Sunday, 2023-07-09 at 13:55 -0700, Loren Wilton wrote:
> > I've patched spamass milter to leave any previously added "X-Spam"
> > headers untouched
>
> It's generally considered bad practice to pass through X-Spam headers
> from an unknown source.
> Like most anything else in an email header, a spammer could inject
> his own headers, probably populated with items designed to generate a
> negative score.
> 

Sure, but updating headers in place and adding its own headers somewhere
else, as spamass-milter does, is also bad practice in my eyes...

I've seen that other milters (clamav-milter in particular) offer an
option to either keep or remove existing virus scanning headers. 

Since I need to patch spamass-milter anyway to resolve a different
issue (calling "sendmail -bv " does not work on postfix
systems), it should be easy to add such an option to spamass-milter.

Regards,

Robert

-- 
Robert Senger





Re: Best practice for adding headers?

2023-07-09 Thread Robert Senger
On Sunday, 2023-07-09 at 19:23 +0200, David Bürgin wrote:
> Hello Robert,
> 
> > Now, I am a bit uncertain about what would be the best practice for
> > a milter to place its headers.
> >
> > I've patched spamass milter to leave any previously added "X-Spam"
> > headers untouched, and just add its own headers at the top of the
> > header list as required by spamassassin's results, thus leaving it
> > up to the downstream software to choose which "X-Spam" headers to
> > use for further processing. This is okay for me.
> >
> > In its original code, spamass-milter adds its own headers to the
> > bottom of the header list, or updates existing "X-Spam" headers in
> > place if their names match those spamass-milter uses.
> > 
> > What do you think?
> 
> I can’t speak for spamass-milter, but in an alternative milter that I
> created¹, I tried to emulate what the ‘spamassassin’ executable does:
> Delete all incoming X-Spam- headers, and insert the newly added
> headers at the top.
> 
> Ciao,
> David
> 
Thanks David, never heard of spamassassin-milter before (it's not in
the Debian repos), but I'll give it a try as there seem to be more
issues with spamass-milter.

Robert

> ¹ https://crates.io/crates/spamassassin-milter

-- 
Robert Senger





Re: Share bayes database between servers

2023-07-09 Thread Robert Senger
On Sunday, 2023-07-09 at 19:21 +0200, Reindl Harald wrote:
> On 2023-07-09 at 19:06, Robert Senger wrote:
> > But bayes data may be updated by either the primary mx or the backup
> > mx, since email may arrive at either server.
> 
> In a smart setup your bayes database is read-only, like here since
> 2014: any autolearning is disabled and the database is trained
> strictly manually from a stored corpus, which gives you the
> opportunity to remove and add messages in the training folders and
> rebuild from scratch.
> 
> we share our bayes-db even with a different company since 2014

Well, that's the boring solution... ;) Nevertheless, this is what I
will likely do if I encounter any problems with the mysql master-master
replication as I have it running now.

Robert

-- 
Robert Senger





Re: spamd runs as root on Fedora Server 38 ?! - was Re: Newb on sa-learn - didn't get what I expected as a response...

2023-07-09 Thread Bill Cole
On 2023-07-07 at 12:08:22 UTC-0400 (Fri, 7 Jul 2023 09:08:22 -0700 (PDT))
Richard Troy is rumored to have said:

> Hi All,
>
> I changed the subject line to hopefully get some insight from a wider
> audience regarding this situation that Reindl uncovered:


It should be noted that Harald Reindl is not a subscriber to this list 
and cannot be as a result of past behavior. Nothing can stop him from 
reading public archives and replying directly to list members, but no 
one else sees them.


SpamAssassin can operate in many different modes. How distribution
packagers choose the 'default' for their installations is beyond the
scope of the SA project per se, and the specific packagers should be
consulted if you need an explanation of their choices.


If you want spamd to be able to access the per-user preferences and
databases for AWL/TxRep and/or Bayes of real system users, spamd must
run as root OR you must devise another working configuration which 
allows that to work. This can be avoided by using virtual users or 
storing per-user configuration in a database rather than in files on 
disk. You can also dispense entirely with spamd and have a milter like 
MIMEDefang call the SA libraries directly, but you still need 
*SOMETHING* running as root (or a semi-privileged user) if you want to 
use per-user configuration living in a POSIX filesystem.
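
To make the "per-user configuration in a database" option concrete, here is
a minimal sketch using an in-memory SQLite table as a stand-in for the real
database. The table layout (username, preference, value), the '@GLOBAL'
name for site-wide defaults, and the function name are assumptions made for
illustration; check the SpamAssassin SQL documentation for the actual
schema and query before relying on any of it.

```python
import sqlite3


def load_prefs(conn, username):
    """Merge site-wide defaults with one user's overrides.

    Assumes a hypothetical table userpref(username, preference, value)
    in which rows owned by '@GLOBAL' hold the defaults.
    """
    prefs = {}
    # Read defaults first, then the user's rows, so the user's values win.
    for owner in ("@GLOBAL", username):
        rows = conn.execute(
            "SELECT preference, value FROM userpref WHERE username = ?",
            (owner,),
        )
        for preference, value in rows:
            prefs[preference] = value
    return prefs


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE userpref (username TEXT, preference TEXT, value TEXT)"
)
conn.executemany(
    "INSERT INTO userpref VALUES (?, ?, ?)",
    [
        ("@GLOBAL", "required_score", "5.0"),
        ("@GLOBAL", "use_bayes", "1"),
        ("alice", "required_score", "4.0"),
    ],
)
print(load_prefs(conn, "alice"))
```

The point of the design is that no process needs filesystem access to
users' home directories: the daemon only needs credentials for the database.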


Arguing over which model is better is pointless, because they are chosen
based on local needs. Scolding people for their choice among the
reasonable options is just silly.


I should probably add that I personally don't do per-user config because 
of the enlarged attack surface it presents and small marginal value, but 
that's guided by local details. I work with systems owned by others 
where other choices were made for very sound reasons and they have not 
had security problems with it, in many years of operations. What you 
choose to do should be based on what YOU want.



--
Bill Cole
b...@scconsult.com or billc...@apache.org
(AKA @grumpybozo and many *@billmail.scconsult.com addresses)
Not Currently Available For Hire


Re: Best practice for adding headers?

2023-07-09 Thread Loren Wilton

> I've patched spamass milter to leave any previously added "X-Spam"
> headers untouched

It's generally considered bad practice to pass through X-Spam headers from an
unknown source.
Like most anything else in an email header, a spammer could inject his own
headers, probably populated with items designed to generate a negative
score.




Re: Best practice for adding headers?

2023-07-09 Thread David Bürgin
Hello Robert,

> Now, I am a bit uncertain about what would be the best practice for a
> milter to place its headers.
>
> I've patched spamass milter to leave any previously added "X-Spam"
> headers untouched, and just add its own headers at the top of the header
> list as required by spamassassin's results, thus leaving it up to the
> downstream software to choose which "X-Spam" headers to use for further
> processing. This is okay for me.
>
> In its original code, spamass-milter adds its own headers to the bottom
> of the header list, or updates existing "X-Spam" headers in place if
> their names match those spamass-milter uses.
> 
> What do you think?

I can’t speak for spamass-milter, but in an alternative milter that I
created¹, I tried to emulate what the ‘spamassassin’ executable does:
Delete all incoming X-Spam- headers, and insert the newly added headers
at the top.

Ciao,
David


¹ https://crates.io/crates/spamassassin-milter
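
The "delete all incoming X-Spam- headers, insert the new ones at the top"
behaviour described above can be sketched on a parsed message with Python's
standard email package. This is an illustration only, not code from either
milter: the function name and the sample message are made up, and a real
milter would request these changes through the milter protocol rather than
rewriting the message itself.

```python
from email import message_from_string


def replace_spam_headers(raw_message, new_headers):
    """Drop incoming X-Spam-* headers and put new_headers at the top."""
    msg = message_from_string(raw_message)
    # Remember every header that is not an X-Spam- header, in original order.
    kept = [
        (name, value)
        for name, value in msg.items()
        if not name.lower().startswith("x-spam-")
    ]
    # Clear the header list (del removes all occurrences of a name)...
    for name in set(msg.keys()):
        del msg[name]
    # ...then rebuild it: freshly generated headers first, originals after.
    for name, value in list(new_headers) + kept:
        msg[name] = value
    return msg


raw = (
    "Received: from mx.example.org\n"
    "X-Spam-Status: No, score=-100.0\n"  # forged by the sender
    "Subject: hello\n"
    "\n"
    "body\n"
)
msg = replace_spam_headers(raw, [("X-Spam-Status", "Yes, score=7.3")])
print(msg.keys())
```

Because the forged header is removed before the new one is added, downstream
software never has to guess which X-Spam-Status to trust.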


Best practice for adding headers?

2023-07-09 Thread Robert Senger
First of all, thanks for your help!

Now, I am a bit uncertain about what would be the best practice for a
milter to place its headers.

I've patched spamass milter to leave any previously added "X-Spam"
headers untouched, and just add its own headers at the top of the header
list as required by spamassassin's results, thus leaving it up to the
downstream software to choose which "X-Spam" headers to use for further
processing. This is okay for me.

In its original code, spamass-milter adds its own headers to the bottom
of the header list, or updates existing "X-Spam" headers in place if
their names match those spamass-milter uses.

What do you think?

Robert


On Wednesday, 2023-07-05 at 01:38 +0200, Robert Senger wrote:
> Hi all,
>
> is there a reason why spamassassin adds its "X-Spam ..." headers to
> the bottom of the header block, not to the top like every other mail
> filtering software (e.g. opendkim, opendmarc, clamav ... ) does? Can
> this behaviour be changed?
> 
> Regards, 
> 
> Robert
> 

-- 
Robert Senger





Share bayes database between servers

2023-07-09 Thread Robert Senger
Hi there,

I am running two mailservers, the first one serving two domains, the
other one serving one domain.

Both serve as backup mx for each other. Both know about users and
aliases of the other domain(s).

On both systems, spamassassin is configured to read/store userprefs and
bayes data (per user) in a local mysql database.

Both systems reject email if the score exceeds a certain limit. To
avoid backscatter (or the need to accept any spam not rejected by the
backup mx), both servers should do their spam filtering based on
exactly the same information, including bayes data.

Now, the question is, what is the best way to share bayes data between
two (or more) servers?

I already share userprefs by setting up master-master replication
between the two mysql databases on both servers. This is unproblematic,
since users (or admins) will update only userprefs for the local
virtual users on each system, which means the backup mx will never
touch the primary mx's userprefs.

But bayes data may be updated by either the primary mx or the backup
mx, since email may arrive at either server. 

I've set up a testing environment that also uses master-master
replication of the mysql bayes database, with priority in dns set to
equal for both mx to get incoming mail distributed evenly to both
systems. So far, this seems to work, but this is a low load
environment.

Any suggestions?

Regards,

Robert


-- 
Robert Senger