http://bugzilla.spamassassin.org/show_bug.cgi?id=3147

           Summary: Bayes training should recognize SA message format and
                    unwrap the spam before processing
           Product: Spamassassin
           Version: SVN Trunk (Latest Devel Version)
          Platform: Other
        OS/Version: other
            Status: NEW
          Severity: normal
          Priority: P3
         Component: spamassassin
        AssignedTo: [EMAIL PROTECTED]
        ReportedBy: [EMAIL PROTECTED]


If you have SA set up to wrap every detected spam as an attachment, you 
currently have to unwrap the attachment manually and hand it to SA for Bayes 
learning.  This is an annoyance for a single message, but with the new ability 
to learn by processing an mbox it will be a real pain.

The Bayes processing, either single message or mbox, should recognise that a 
spam or ham has been wrapped as an attachment by a previous SA run, and only 
learn from the attachment.  

Detection needs to be automatic and per message.  Some of the spam may have 
made it through and thus not be wrapped (and will in fact carry headers 
indicating it is NOT spam, which should also be removed), and some of the spam 
will have been detected by rules that overrode a Bayes ham score for the 
message.  In either case Bayes should be trained on the original message.
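
A rough sketch of the per-message check described above, written in Python 
rather than SA's own Perl and meant purely as illustration.  The function name 
unwrap_for_learning is hypothetical.  It assumes that a wrapped message carries 
"X-Spam-Flag: YES" on the outer message and holds the original as a 
message/rfc822 part, and that all SA markup headers start with "X-Spam-".  This 
is a heuristic: a flagged but unwrapped message that happens to contain a 
forwarded mail would be unwrapped by mistake.

```python
from email import message_from_string
from email.message import Message


def unwrap_for_learning(raw: str) -> Message:
    """Return the message that Bayes should tokenize (hypothetical helper)."""
    msg = message_from_string(raw)

    # Assumption: a previous SA run that wrapped the mail left
    # "X-Spam-Flag: YES" on the outer (wrapper) message.
    flagged = (msg.get("X-Spam-Flag") or "").strip().lower().startswith("yes")

    if flagged and msg.is_multipart():
        for part in msg.walk():
            # Assumption: the original mail is attached as message/rfc822.
            if part.get_content_type() == "message/rfc822":
                payload = part.get_payload()
                if isinstance(payload, list) and payload:
                    msg = payload[0]  # learn from the attachment only
                    break

    # Wrapped or not, drop markup from earlier SA runs so that it does not
    # end up in the token stream.
    for name in list(msg.keys()):
        if name.lower().startswith("x-spam-"):
            del msg[name]

    return msg
```

The cleaned message could then be serialized with as_string() and handed to the 
learner in place of the mail as received.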

Obviously the same goes for learning ham, where much of what is passed in will 
very probably be mail that was mis-detected as spam.

Perhaps there needs to be an option to enable or, more reasonably, disable this 
automatic detection and unwrapping/header stripping.  However, the default 
should probably be to detect that the message has been through SA previously, 
and eliminate the items that can corrupt the token gathering.



------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
