On 02.10.2015 at 19:15, Andrew Davidson wrote:
> I'm not an expert on the mechanics of Bayes so I'm wondering how valuable it is to continue training with collected spam that is properly tagged with BAYES_999. Does that help to reinforce the logic or is it overly focusing the database on emails it can already detect? Should I only be training it with miscategorized emails and emails in the 20-80% confidence range?
Yes, it helps, because such mail contains clear spam parts that will partly be repeated in future messages. I have been doing that here for many months now and the results keep getting better and better. BAYES_00 hits on 85% of all scanned messages because we heavily train ham as well as spam.
0    51534  SPAM
0    19007  HAM
0  2161267  TOKEN
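
To illustrate why training on spam that is already detected can still reinforce the database, here is a minimal Python sketch of a Robinson-style smoothed per-token probability. It is only an illustration under stated assumptions, not SpamAssassin's actual code: the function names, the `strength` and `prior` parameters and the per-token hit counts are made up for the example; only the message totals (51534 spam, 19007 ham) are taken from the counts in this mail.

```python
# Illustrative sketch only; not SpamAssassin's implementation.
NSPAM = 51534  # trained spam messages (from the counts in this mail)
NHAM = 19007   # trained ham messages (from the counts in this mail)


def token_spam_prob(spam_hits, ham_hits, nspam=NSPAM, nham=NHAM):
    """Raw spam probability of a token from its relative frequencies."""
    spam_freq = spam_hits / nspam
    ham_freq = ham_hits / nham
    return spam_freq / (spam_freq + ham_freq)


def smoothed_prob(spam_hits, ham_hits, nspam=NSPAM, nham=NHAM,
                  strength=1.0, prior=0.5):
    """Robinson-style smoothing: with few observations the estimate
    stays close to the neutral prior; more observations pull it
    towards the raw probability."""
    n = spam_hits + ham_hits
    if n == 0:
        return prior  # token never seen: stay neutral
    p = token_spam_prob(spam_hits, ham_hits, nspam, nham)
    return (strength * prior + n * p) / (strength + n)


if __name__ == "__main__":
    # Hypothetical token that only ever appears in spam,
    # seen in a growing number of trained messages.
    for hits in (1, 2, 20, 200):
        print(hits, round(smoothed_prob(hits, 0), 4))
```

In this simplified model a token seen in only a couple of spam messages is still weighted cautiously, while one seen in hundreds is weighted close to certain. That is the sense in which continued training on clear spam (and clear ham) strengthens the tokens that show up again in later, less obvious messages.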