On 02/13/2018 11:45 AM, Horváth Szabolcs wrote:
Reindl Harald [mailto:h.rei...@thelounge.net] wrote:
I think I have no control over what is learnt automatically.
surely, don't do autolearning at all


This is a mail gateway for multiple companies. I'm not supposed to read e-mails 
on it, or to pick out mails that could be used for learning ham.
And I can't ask users to use a "ham" mailbox, because they are not IT experts; 
sometimes they have trouble even with simple mail forwarding.


If you aren't allowed to check specific emails that have a suspicious subject or that your users report as spam, there's no way you can do your job of filtering email accurately.

Without autolearning and without the help of the end-users, I can't build a 
proper ham bayes database, can I?


SA's autolearning doesn't use the results from BAYES_* rules, since that could make incorrect training even worse. So you are going to have to build local rules, or get help from RBLs and other SA plugins, for messages to reach the autolearning thresholds.
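
To make the point about thresholds concrete, here is a minimal Python sketch of threshold-based autolearning with the Bayes hits left out. It is a simplified illustration, not SpamAssassin's actual code: the function name `autolearn_decision` and the per-rule scores are made up for the example, the threshold values are assumed to match the documented defaults of `bayes_auto_learn_threshold_nonspam` and `bayes_auto_learn_threshold_spam`, and the real plugin applies further conditions that are omitted here.

```python
# Simplified illustration only; SpamAssassin's real autolearn logic has more checks.
# Assumed to mirror the documented default thresholds.
NONSPAM_THRESHOLD = 0.1
SPAM_THRESHOLD = 12.0


def autolearn_decision(rule_hits):
    """Decide whether a message would be autolearned.

    rule_hits maps a rule name to the score it contributed.
    Returns "ham", "spam", or None (no autolearning).
    """
    # Leave out BAYES_* hits so the classifier never trains on its own verdict.
    score = sum(
        points for name, points in rule_hits.items()
        if not name.startswith("BAYES_")
    )
    if score < NONSPAM_THRESHOLD:
        return "ham"
    if score > SPAM_THRESHOLD:
        return "spam"
    return None


# Hypothetical scores: Bayes is confident, but too little else hits,
# so the message lands between the thresholds and is not learned.
print(autolearn_decision({"BAYES_99": 3.5, "URIBL_BLACK": 1.7}))
```

In this sketch a message only reaches the spam threshold when enough non-Bayes rules fire, which is why local rules and RBLs matter for autolearning.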

With non-English email flow, it's more challenging. If no RBLs hit, then you really must train your Bayes properly, which requires some way to accurately tell ham from spam. You must keep a copy of the ham and spam corpora and be allowed to review suspicious email.

Can you set up a split copy of the email that redacts the recipient or anonymizes it enough to allow for review? If not, your filtering is not going to be accurate.
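
One possible way to redact the review copy is sketched below with Python's standard `email` package. The function name `redact_recipients`, the header list, and the placeholder address are assumptions for illustration. The sketch only rewrites recipient headers, so addresses that appear in `Received:` lines or in the message body would still need separate handling.

```python
from email import message_from_string, policy

# Headers that commonly reveal the recipient (an assumed, non-exhaustive list).
RECIPIENT_HEADERS = ("To", "Cc", "Bcc", "Delivered-To", "X-Original-To", "Envelope-To")


def redact_recipients(raw_message, placeholder="redacted@example.invalid"):
    """Return a copy of the message text with recipient headers replaced."""
    msg = message_from_string(raw_message, policy=policy.default)
    for name in RECIPIENT_HEADERS:
        if name in msg:
            del msg[name]            # removes every occurrence of the header
            msg[name] = placeholder  # keeps the header present but anonymous
    return msg.as_string()


sample = (
    "From: sender@example.com\n"
    "To: user@example.org\n"
    "Subject: test\n"
    "\n"
    "Hello\n"
)
print(redact_recipients(sample))
```

The redacted copy could then be stored in the ham or spam corpus for review, while the original is delivered unchanged.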

Best regards
   Szabolcs


--
David Jones
