Asif Iqbal wrote:
> On Dec 7, 2007 10:13 PM, Matt Kettler <[EMAIL PROTECTED]> wrote:
>
>> Asif Iqbal wrote:
>>
>>> Hi All
>>>
>>> I took a message and learned it as ham like this
>>>
>>> cat email-with-headers | sa-learn --ham
>>>
>>> Now I should expect the exact same email to be considered as ham. Correct?
>>>
>> No. You'd expect the bayes score to go down, but that alone might not be
>> enough.
>>
>
> Bayes score did go down
>
>>> But it does not. When I pipe it through spamassassin like following it
>>> shows it as spam
>>>
>>> cat email-with-headers | spamassassin -D
>>>
>> Care to show us the rule hits?
>>
>
> X-Spam-Status: Yes, score=5.2 required=5.0
> tests=BAYES_00,MISSING_DATE,
> MISSING_HB_SEP,MISSING_HEADERS,MISSING_MID,MISSING_SUBJECT,NO_RELAYS,
> TVD_SPACE_RATIO autolearn=no version=3.2.3
> X-Spam-Report:
> * 0.0 MISSING_MID Missing Message-Id: header
> * 0.0 MISSING_DATE Missing Date: header
> * -0.0 NO_RELAYS Informational: message was not relayed via SMTP
> * 2.5 MISSING_HB_SEP Missing blank line between message header and
> body
> * 1.3 MISSING_HEADERS Missing To: header
> * -2.6 BAYES_00 BODY: Bayesian spam probability is 0 to 1%
> * [score: 0.0000]
> * 2.2 TVD_SPACE_RATIO BODY: TVD_SPACE_RATIO
> * 1.8 MISSING_SUBJECT Missing Subject: header
>
> However I do not understand why it says missing subject. The email
> does have Subject: This is a subject
>

The clue there is probably the "MISSING_HB_SEP". There's probably something invalid in the headers prior to the Subject: that's causing SA to assume the body has started. That would also explain the other MISSING_* hits: any header that comes after the invalid line gets treated as body text.

Generally speaking, this is caused by one of the following common errors:

1) a header name that has a space in it. That's invalid, and confuses the parser.

2) improper folding of a multi-line header (i.e. the folded line doesn't start with whitespace). Also invalid, and also confusing to the parser.
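
If it helps to find the offending line, here is a rough Python sketch that scans the header block for the two errors above. To be clear, this is not how SA parses headers; it is only a simplified check based on the RFC 5322 rule that a field name is printable ASCII with no spaces or colon, and that folded lines start with a space or tab. The function name `find_header_problems` and the sample message are made up for illustration.

```python
import re

# RFC 5322 field name: printable US-ASCII (33-126) except ":".
FIELD_RE = re.compile(r"^[\x21-\x39\x3b-\x7e]+:")


def find_header_problems(raw_message):
    """Return (line_number, line, reason) for each suspicious header line.

    Hypothetical helper: a simplified check, not SpamAssassin's parser.
    """
    problems = []
    for number, line in enumerate(raw_message.splitlines(), start=1):
        if line == "":
            break  # first blank line ends the header block
        if number == 1 and line.startswith("From "):
            continue  # mbox envelope line, not a real header
        if line[0] in " \t":
            continue  # properly folded continuation line
        if FIELD_RE.match(line):
            continue  # looks like a valid "Name:" header
        name, colon, _ = line.partition(":")
        if colon and (" " in name or "\t" in name):
            # Could also be an unindented folded line that happens
            # to contain a colon; a human has to decide which.
            reason = "space in header name"
        else:
            reason = "not a header or folded line"
        problems.append((number, line, reason))
    return problems


# Made-up sample with one error of each kind before the Subject: header.
sample = (
    "Received: from a\n"
    " by b\n"
    "X Bad Name: 1\n"
    "To: x@example.com\n"
    "some unfolded text\n"
    "Subject: This is a subject\n"
    "\n"
    "body: text\n"
)

for number, line, reason in find_header_problems(sample):
    print(f"line {number}: {reason}: {line!r}")
```

To try it on the real message, read email-with-headers into a string and pass it to the function. The first line it reports is the most likely place where SA decided the body had started.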