Hi RW, thanks for your reply.

>It's unlikely that that could push the BAYES RESULT down to BAYES_00
>unless there is uncorrected mistraining.

Possibly, but I suspect mistraining isn't the problem: apart from this specific type of spam, SpamAssassin is doing (and has done for some time) a very good job of classifying mail correctly. A dump of the Bayes database shows about 30k each of spam and ham that it has learned from, so even if the odd few messages were mistrained, I don't think they would make up a significant percentage.

>I don't think the 3.2.x rules get updated much. Perhaps this is leading
>to false autotraining in BAYES.

Ah, perhaps this is more of a problem. I didn't realise that rule updates differ between SpamAssassin versions (well, not between 3.2.x and 3.3.x anyway). In that case, I'll try upgrading SpamAssassin and see if that helps.

Incidentally, I'm not sure autotraining is much of a problem, as only very obvious (high-scoring) spam and ham seems to trigger it, according to the headers at least. Certainly none of this particular type of spam is being autotrained according to the headers.

Finally, do you know if SpamAssassin has rules that *should* catch this type of spam? No legitimate email would include big blocks of random paragraphs inside HTML comments, so I would have thought that alone would be picked up by a rule.

Thanks again,
David.

--
View this message in context: http://old.nabble.com/Text-contained-in-HTML-comments-causing-BAYES_00-to-classify-as-non-spam-tp29342874p29345981.html
Sent from the SpamAssassin - Users mailing list archive at Nabble.com.
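
P.S. To make the last question concrete, here is a minimal Python sketch of the kind of check I have in mind. It is not a SpamAssassin rule or plugin, and it says nothing about how SpamAssassin tokenises mail for Bayes. The function names and the thresholds (200 hidden characters, 50% of the text) are assumptions picked purely for illustration.

```python
import re

# Matches an HTML comment, including comments that span several lines.
COMMENT_RE = re.compile(r"<!--(.*?)-->", re.DOTALL)
# Crude tag stripper; good enough for a sketch, not a real HTML parser.
TAG_RE = re.compile(r"<[^>]+>")


def _non_space_len(text):
    """Count the characters in text, ignoring all whitespace."""
    return len("".join(text.split()))


def hidden_and_visible_chars(html):
    """Return (chars inside HTML comments, chars of visible text)."""
    hidden = sum(_non_space_len(body) for body in COMMENT_RE.findall(html))
    without_comments = COMMENT_RE.sub("", html)
    visible = _non_space_len(TAG_RE.sub("", without_comments))
    return hidden, visible


def comment_text_ratio(html):
    """Fraction of the message text that is hidden inside HTML comments."""
    hidden, visible = hidden_and_visible_chars(html)
    total = hidden + visible
    return hidden / total if total else 0.0


def looks_like_comment_padding(html, min_hidden_chars=200, min_ratio=0.5):
    """Hypothetical heuristic: lots of comment text relative to visible text."""
    hidden, _ = hidden_and_visible_chars(html)
    return hidden >= min_hidden_chars and comment_text_ratio(html) >= min_ratio
```

Calling `looks_like_comment_padding(html_part)` on the HTML part of a message would flag mail where most of the text sits inside comments. Requiring both a minimum amount of hidden text and a minimum ratio is meant to avoid flagging ordinary mail that contains a few short comments.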