On 18.03.2015 at 23:34, RW wrote:
> On Wed, 18 Mar 2015 22:46:14 +0100 Reindl Harald wrote:
>> frankly i trained over months with *hand chosen* mail samples and spent nearly two weeks day and night to remove bayes-poisoning from the samples and rebuild bayes from scratch, which reduced the ntokens from 1700000 to 1500000
> Why did you remove the Bayes-poison?
because now BAYES_00 in case of legit mail is at 87% of all scanned messages, BAYES_50 dropped from 10% to 4% and the milter-rejects are still at around 8-10% with just 10 instead of 150 flagged messages on a userbase with 1200 valid RCPTs
because the bayes finally has a quality where, in combination with other filters, it needs little to no further training
over the long run the poison leads to more and more legit mail getting a higher score than it deserves, the FP rate increases and in the end you need to relax the reject score, passing more junk, because of user complaints - at that point the spammers have won, and sooner or later you need to reset bayes and start training from scratch
that's not theory, i observed that behavior over many years with commercial appliances that use SA behind the scenes with auto-learning enabled
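
to illustrate the drift described above, here is a minimal toy sketch. It is not SpamAssassin's actual Bayes implementation (SA combines token probabilities in a more elaborate way). The function name `token_spam_prob` and all counts are hypothetical, chosen only to show how a word typical of legit mail gains spam probability once poisoned spam samples end up in the learned corpus.

```python
# Toy model only: per-token spam probability from learned counts.
# This is NOT SpamAssassin's real algorithm; all numbers are made up.

def token_spam_prob(spam_hits, ham_hits, n_spam, n_ham):
    """Estimate P(spam | token) from how often the token was seen per class."""
    spam_rate = spam_hits / n_spam  # fraction of learned spam containing the token
    ham_rate = ham_hits / n_ham     # fraction of learned ham containing the token
    return spam_rate / (spam_rate + ham_rate)

# A word typical of legit mail, learned from clean, hand-chosen samples.
clean = token_spam_prob(spam_hits=5, ham_hits=400, n_spam=1000, n_ham=1000)

# The same word after 300 more spam messages were learned, 295 of them
# stuffed with legit-looking words (the poison).
poisoned = token_spam_prob(spam_hits=300, ham_hits=400, n_spam=1300, n_ham=1000)

print(f"clean:    {clean:.3f}")
print(f"poisoned: {poisoned:.3f}")
```

in this toy model the token moves from a strong ham indicator towards the undecided middle, so legit mail containing it scores higher than before even though the mail itself has not changed.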