On 22.09.2016 at 10:16, Thomas Barth wrote:
On 21.09.2016 at 18:47, Bowie Bailey wrote:
That is ridiculous. The more training bayes gets the better it works.
And manual training is better than autolearning because autolearning can
automatically learn false positives and false negatives and cause
problems for the database.
And what about filter poisoning? In the last 10 hours my company address
got 43 mails classified as spam (even a virus mail detected today). And
there was one mail classified as spam due to my rule (bad country,
Your payment has been approved. Your account will be debited within two
You can email us for any query regarding your account.
There is no spam content, am I right? Normal words and content that a
normal person can use. I don't need spam learning for all the mails
already classified as spam with a high score. Spam mails with a low score,
like this one, are the interesting ones for spam learning. But when I use
these mails for spam learning, isn't there a risk of false positives some
day, because it has learned that normal mails are also spam?
no, you are not right - that *is spam content* and has nothing to do with
bayes poisoning - in fact those are malware messages - known to our
bayes for at least 12 months, and stuff that is already BAYES_99 will not be trained
it's the job of the bayes filter to find the minimal but existing
differences and mistakes between that and similar ham, and *hence*
autolearning won't work in general, because you still need to decide and
classify the border cases yourself
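
to make the "minimal but existing differences" point concrete, here is a rough toy sketch in Python. it is NOT how SpamAssassin's bayes works internally (that combines token probabilities differently); the ToyBayes class, the training sentences and the "open the attached document" wording are all made up for illustration. the idea is only that words shared by ham and spam end up neutral, so the verdict is driven by the few tokens that differ:

```python
import math
from collections import Counter


class ToyBayes:
    """Hypothetical toy classifier, not SpamAssassin's algorithm."""

    def __init__(self):
        self.counts = {"spam": Counter(), "ham": Counter()}
        self.msgs = {"spam": 0, "ham": 0}

    def learn(self, label, text):
        # Count each token at most once per message.
        self.msgs[label] += 1
        self.counts[label].update(set(text.lower().split()))

    def token_prob(self, token):
        # Smoothed share of spam vs. ham messages containing the token.
        s = (self.counts["spam"][token] + 1) / (self.msgs["spam"] + 2)
        h = (self.counts["ham"][token] + 1) / (self.msgs["ham"] + 2)
        return s / (s + h)

    def spam_prob(self, text):
        # Naive combination: sum the log-odds of every distinct token.
        log_odds = 0.0
        for token in set(text.lower().split()):
            p = self.token_prob(token)
            log_odds += math.log(p) - math.log(1 - p)
        return 1 / (1 + math.exp(-log_odds))


bayes = ToyBayes()
# Invented training data: ham and spam share most of their wording.
bayes.learn("ham", "your payment for the invoice has been approved thanks")
bayes.learn("ham", "please email us the report for your account")
bayes.learn("spam", "your payment has been approved open the attached document")
bayes.learn("spam", "your account will be debited open the attached document")

# "your", "payment", "approved" ... occur equally in both classes, so they
# come out at 0.5 and contribute nothing; only the differing tokens count.
shared_word = bayes.token_prob("payment")
ham_like = bayes.spam_prob("your payment has been approved thanks")
spam_like = bayes.spam_prob(
    "your payment has been approved open the attached document"
)
print(shared_word, ham_like, spam_like)
```

in this toy setup, learning the low-scoring spam does not make the shared "normal" words spammy, as long as comparable ham is trained too.
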
bayes poisoning can become a problem and is *another* reason why you
train your filter manually instead of letting it decide by itself and, once
it has decided wrong, learn more and more in the wrong direction
but the above is NOT bayes poisoning