Hi all,

Another question.

Short version: will spam report header lines from other spam filters confuse sa-learn?

Long version: I have a second email account that I have set to forward into my main email account. It turns out that this second email account has a spam-checker on it. I've included two representative sets of headers from that spam filter below (with x'ed out IP addresses).

    X-Spam-Report: TrustedSender=yes, SenderIP=xxx.xxx.xxx.xxx
    X-Spam-PmxInfo: Server=avs-1, Version=4.7.0.111621,
        Antispam-Engine: 2.0.0.0, Antispam-Data: 2004.9.21.8,
        SenderIP=xxx.xxx.xxx.xxx
    X-Spam-Score:
    X-Spam-Report: IsSpam=no, Probability=7%,
        Hits=__HAS_MSGID 0, __SANE_MSGID 0
    X-Spam-PmxInfo: Server=avs-6, Version=4.7.0.111621,
        Antispam-Engine: 2.0.0.0, Antispam-Data: 2004.9.21.8,
        SenderIP=xxx.xxx.xxx.xxx

That's an example of ham coming through the server. Here are some headers from a spam message:

    X-Spam-Report: TrustedSender=yes, SenderIP=xxx.xxx.xxx.xxx
    X-Spam-PmxInfo: Server=avs-7, Version=4.7.0.111621,
        Antispam-Engine: 2.0.0.0, Antispam-Data: 2004.9.24.4,
        SenderIP=xxx.xxx.xxx.xxx
    X-Spam-Score: *******
    X-Spam-Report: IsSpam=yes, Probability=99%,
        Hits=RELAY_IN_CBL 8, URI_CLASS_FINANCIAL_DOMAIN 8,
        OBFU_CLASS_FINANCIAL_MED 4, CTYPE_JUST_HTML 0.848, __CT 0,
        __CTE 0, __CTYPE_HTML 0, __CTYPE_IS_HTML 0, __HAS_MSGID 0,
        __MIME_HTML 0, __MIME_HTML_ONLY 0, __SANE_MSGID 0,
        __TAG_EXISTS_BODY 0, __TAG_EXISTS_HTML 0
    X-Spam-PmxInfo: Server=avs-3, Version=4.7.0.111621,
        Antispam-Engine: 2.0.0.0, Antispam-Data: 2004.9.24.4,
        SenderIP=xxx.xxx.xxx.xxx

If the email has already been tagged as spam by the account's filter, I see no reason to waste CPU cycles running spamassassin to check it. OTOH, I would still like to run it through the Bayesian learner. I've put the following lines into my .procmailrc above the invocation of spamassassin:

    :0 c : uwspam.lock
    * ^X-Spam-Score: \*\*\*\*\*\*
    | sa-learn --spam

    :0 A
    mail/spam-UW

This way, I've got a copy (just in case the filter blows it), and I've run it through the learner.

Will the X-Spam-Report: lines from the other filter confuse the Bayesian learner?
The SpamAssassin documentation mentions that sa-learn will strip SpamAssassin's own markup, but I wouldn't expect it to strip these headers from the other filter.

Thanks,
-Greg
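
P.S. In case the answer turns out to be "yes, they can confuse it", the workaround I have in mind is to strip the other filter's headers before the message reaches sa-learn. Below is a rough, untested-in-production sketch of that idea in Python. The names FOREIGN_HEADERS and strip_foreign_headers are my own inventions for illustration, and the header list is just the three header names shown above. Note that re-serialising the message this way is not guaranteed to be byte-for-byte identical to the original apart from the removed headers.

```python
import sys
from email import message_from_bytes

# Header names added by the other account's filter (taken from the samples
# in this post). Adjust if the upstream filter adds others.
FOREIGN_HEADERS = ("X-Spam-Report", "X-Spam-PmxInfo", "X-Spam-Score")


def strip_foreign_headers(raw: bytes) -> bytes:
    """Return the message with every occurrence of the listed headers removed."""
    msg = message_from_bytes(raw)
    for name in FOREIGN_HEADERS:
        # Deleting by name removes all headers with that name and does
        # nothing if the header is absent.
        del msg[name]
    return msg.as_bytes()


if __name__ == "__main__":
    # Act as a stdin-to-stdout filter so it can sit in a pipeline.
    sys.stdout.buffer.write(strip_foreign_headers(sys.stdin.buffer.read()))
```

If this were saved as, say, strip-foreign-headers.py (a hypothetical filename), the action line of my first recipe would presumably become "| python3 strip-foreign-headers.py | sa-learn --spam". I'd rather not bother with the extra step if the learner copes with these headers on its own, though.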