On Wed, 11 Jan 2017 09:29:51 +0100
Matus UHLAR - fantomas wrote:

> >> On 10.01.17 10:48, Emin Akbulut wrote:  
> >> >Recently we receive spam messages and SA cannot block them.  
> [deleted]
> >> >Message source:
> >> >http://pastebin.com/nnN0jGw8  
> 
> >On Tue, 10 Jan 2017 10:43:40 +0100 Matus UHLAR - fantomas wrote:  
> >> clear case of mistrained BAYES causing the message to be marked as ham.
> >> you just have to re-train such spams as spam, it may take some time
> >> (not very long) until it starts hitting properly.  
> 
> On 10.01.17 14:13, RW wrote:
> >The pastebin example was auto-learned as ham, it may be hard to
> >counter this with manual training.  
> 
> depends... I found that proper training can fix it quite fast

Since manual training unlearns a message before it relearns it, it's
feasible to undo all the damage, but it's difficult to do that outside
of a single-user database. If you don't catch every mislearned message,
you aren't fixing it; you are just working around the damage.
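
For reference, relearning is one command per batch. A rough sketch,
assuming the missed spam has been saved one message per file in a
hypothetical directory ~/missed-spam, and that it is run as the user
who owns the Bayes database:

 # relearn as spam; a message previously learned as ham is forgotten first
 sa-learn --spam ~/missed-spam/
 # check the nspam/nham counters afterwards
 sa-learn --dump magic

With a site-wide or per-user setup you would have to repeat this
against each database that autolearned the message, which is where it
gets difficult.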


> >bayes_auto_learn_threshold_nonspam should be set lower.   
> 
> I agree, and would set that to -0.1 max. However this requires network
> checks on, since there are nearly no rules other than network and
> bayes with negative score.

And some of those negative-scoring network rules are arguably
pay-to-spam lists.

IMO there's no good way to autolearn ham unless you are prepared to
write enough local rules to positively identify it. It should be seen
as a last resort.

If you are in a position to train manually, then IMO autotraining is
more trouble than it's worth, except perhaps for augmenting manual
training with something like:

 bayes_auto_learn_on_error 1
 bayes_auto_learn_threshold_nonspam  -1000

This lets Bayes do some useful spam learning in real time, without
much risk of mistraining.
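
Spelled out as a local.cf fragment with comments, as a sketch only: the
first and last settings are my assumption of a typical setup (both are
the stock defaults as far as I know) and can be left out if you haven't
changed them.

 # autolearning has to be enabled for any of this to apply (default: 1)
 bayes_auto_learn 1
 # skip autolearning when Bayes already agreed with the verdict
 bayes_auto_learn_on_error 1
 # no realistic score is below -1000, so nothing is autolearned as ham
 bayes_auto_learn_threshold_nonspam  -1000
 # high-scoring spam is still eligible for autolearning (default: 12.0)
 bayes_auto_learn_threshold_spam  12.0

Running "spamassassin --lint" after editing should confirm that the
file still parses.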
