On Mon, 18 Jun 2018 10:13:04 -0600
@lbutlr wrote:

> On 18 Jun 2018, at 08:47, RW <rwmailli...@googlemail.com> wrote:
> > On Mon, 18 Jun 2018 06:13:06 -0600
> > @lbutlr wrote:
> >   
> >> I have a script that runs when a mail is moved out of the Junk
> >> folder to pass the mail through sa-learn --ham,   
> > 
> > 
> > Whether this is the Dovecot plugin or something local it's a poor
> > way of training Bayes. You're training on SA errors not Bayes
> > errors. Most imperfect Bayes results don't translate into
> > misclassifications.  
> 
> I’m not sure what you’re trying to say here. Certainly SA does
> misclassify mail as spam at times, ...
> Training the messages as ham is useful.

The problem is that, unless there is something badly wrong, a typical
single-user account won't generate enough FPs and FNs to train the
database properly. I found that Bayes's identification of ham improved
until I'd trained about 1500 ham, but I wouldn't expect to get anything
like 1500 SpamAssassin FPs in a lifetime.

It's not even proper train-on-error, because it's training on
SpamAssassin misclassifications and not correcting Bayes's own
errors. It allows Bayes to go uncorrected until it results
in an FP or FN.
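
As a rough illustration of that difference, here is a minimal Python
sketch. It is not SpamAssassin code: the function names, the 0.5 Bayes
cut-off and the sample scores are all made up for the example; only the
5.0 threshold corresponds to SA's default required_score.

```python
# Illustrative sketch only: the helper names, the Bayes cut-off and the
# sample messages are assumptions, not SpamAssassin internals.
SA_REQUIRED_SCORE = 5.0  # SpamAssassin's default required_score
BAYES_CUTOFF = 0.5       # hypothetical "Bayes thinks this is spam" cut-off


def sa_error(sa_score, is_spam):
    """True if the overall SA verdict disagrees with the real label."""
    return (sa_score >= SA_REQUIRED_SCORE) != is_spam


def bayes_error(bayes_prob, is_spam):
    """True if Bayes on its own disagrees with the real label."""
    return (bayes_prob >= BAYES_CUTOFF) != is_spam


# (overall SA score, Bayes probability, really spam?)
messages = [
    (1.2, 0.80, False),  # ham: Bayes wrong, SA right
    (6.5, 0.10, False),  # ham: SA FP, although Bayes was right
    (7.0, 0.99, True),   # spam: both right
    (3.0, 0.30, True),   # spam: both wrong
]

# What a move-out-of-Junk style hook trains on: SA misclassifications.
plugin_trains = [m for m in messages if sa_error(m[0], m[2])]

# What train-on-error for Bayes would train on: Bayes's own mistakes.
toe_trains = [m for m in messages if bayes_error(m[1], m[2])]

print("trained via SA errors:   ", plugin_trains)
print("trained via Bayes errors:", toe_trains)
```

In this toy data the first message is a Bayes error that never gets
corrected, because the overall SA verdict happened to be right.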

You can work around the plugin's deficiencies by using autotraining or
doing some additional training, but then the plugin is of limited
relevance.

IMO the plugin is best left to statistical filters like DSPAM.
