http://bugzilla.spamassassin.org/show_bug.cgi?id=3147
Summary: Bayes training should recognize SA message format and
unwrap the spam before processing
Product: Spamassassin
Version: SVN Trunk (Latest Devel Version)
Platform: Other
OS/Version: other
Status: NEW
Severity: normal
Priority: P3
Component: spamassassin
AssignedTo: [EMAIL PROTECTED]
ReportedBy: [EMAIL PROTECTED]
If you have SA set up to wrap all detected spams as attachments, you currently
have to manually unwrap the attachment and hand it to SA for Bayes learning.
This is an annoyance on a single message, but with the new ability to learn by
processing an mbox it will be a real pain.
The Bayes processing, either single message or mbox, should recognise that a
spam or ham has been wrapped as an attachment by a previous SA run, and only
learn from the attachment.
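As a rough illustration of the detection being asked for (not SA's actual code, which is Perl), here is a minimal Python sketch. It assumes a wrapped message is one that carries an "X-Spam-Flag: YES" header and holds the original as a message/rfc822 attachment; the function name unwrap_if_sa_wrapped is made up for this example, and other wrapping modes (e.g. the original attached as text/plain) are not handled.

```python
import email
from email.message import Message


def unwrap_if_sa_wrapped(raw: bytes) -> Message:
    """Return the original message if `raw` looks SA-wrapped, else the parsed message.

    Hypothetical helper: the detection heuristic below is an assumption,
    not SpamAssassin's own logic.
    """
    msg = email.message_from_bytes(raw)

    # Assumption: a wrapped message is flagged as spam and is multipart.
    flagged = msg.get("X-Spam-Flag", "").strip().upper() == "YES"
    if not flagged or not msg.is_multipart():
        return msg

    # Look for the first message/rfc822 part; its payload is a one-element
    # list holding the attached (original) message.
    for part in msg.walk():
        if part.get_content_type() == "message/rfc822":
            payload = part.get_payload()
            if isinstance(payload, list) and payload:
                return payload[0]

    # Flagged but no attached message found: fall back to the message as-is.
    return msg
```

A message that was never wrapped passes through unchanged, so the same call could be applied to every entry of an mbox without knowing in advance which ones were wrapped.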
This needs to be automatic detection on a per-message basis, as some of the
spam may have made it through and thus not be wrapped (and in fact have headers
indicating it is NOT spam, which should also be removed), and some of the spam
will have been detected by rules that overrode a Bayes ham score for the
message. In either case Bayes should be trained with the original message.
Obviously the same goes for learning ham, where much of what is passed in will
very probably be mail that was mis-detected as spam.
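The header-stripping half of this could look something like the Python sketch below. It assumes that the markup left by a previous SA run consists of headers whose names start with "X-Spam-"; strip_sa_headers is a hypothetical name, and other traces of a previous run (such as a rewritten Subject line) are deliberately left alone here.

```python
import email
from email.message import Message


def strip_sa_headers(msg: Message) -> Message:
    """Remove headers assumed to come from a previous SA run (X-Spam-*).

    Hypothetical helper for illustration; modifies `msg` in place and
    returns it for convenience.
    """
    # Collect names first so we do not mutate the header list while iterating.
    sa_names = {name for name in msg.keys() if name.lower().startswith("x-spam-")}
    for name in sa_names:
        # Deleting by name removes every occurrence of that header.
        del msg[name]
    return msg
```

This applies equally to spam that slipped through with "not spam" markup and to ham that was wrongly tagged, since in both cases the old verdict headers are dropped before tokens are gathered.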
Perhaps there needs to be an option to enable, or more reasonably, disable this
automatic detection and unwrapping/header stripping. However, the default
should probably be to detect that the message has been through SA previously,
and eliminate the items that can corrupt the token gathering.