http://bugzilla.spamassassin.org/show_bug.cgi?id=3163
------- Additional Comments From [EMAIL PROTECTED]  2004-03-12 15:10 -------
> I'm concerned that [A-Za-z] and \w are too locale-specific, so I'd like
> to figure out exactly why they improve results so much over \S.

HTML like this, where punctuation follows something in an anchor, is
probably fairly common:

  <a href="mailto:[EMAIL PROTECTED]"><u>[EMAIL PROTECTED]</a></u>;

so that might be why it reduces false positives (\S would match the
semicolon).

I guess the use of [A-Za-z] instead of \S would reduce the number of true
positives in non-Roman messages, but most spam (that I receive) uses Roman
characters, so I'm not sure why you are surprised at the improvement it
does get. Maybe I misunderstood the comment.

The reduction in the number of hits on spam messages in the tables above is
probably due to false positives in those spam messages that do not contain
obfuscation. Perhaps the false positives in ham messages can be reduced
further; I'm going to look for ham that the rules hit so they can be
tweaked.

------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
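
To make the semicolon point in the comment concrete, here is a rough sketch in Python rather than in SpamAssassin's own rule syntax. The two patterns are hypothetical simplifications, not the actual rules from this bug: each looks for a letter, one or more HTML tags, and then one more character, and they differ only in whether that last character is \S or [A-Za-z]. The address user@example.com is a placeholder, since the original is redacted.

```python
import re

# Hypothetical stand-ins for the rules under discussion (assumption: the real
# rules look for a character that directly follows HTML tags inside a word).
LOOSE = re.compile(r"[A-Za-z](?:<[^>]*>)+\S")          # trailing \S
STRICT = re.compile(r"[A-Za-z](?:<[^>]*>)+[A-Za-z]")   # trailing [A-Za-z]

# Ham: ordinary HTML where punctuation follows the closing tags.
# The tag nesting mirrors the example in the comment; the address is made up.
ham = '<a href="mailto:user@example.com"><u>user@example.com</a></u>; see you then'

# Spam: a word split by empty tags to dodge keyword filters.
spam = "Buy V<b></b>iagra now"

def hits(pattern: re.Pattern, text: str) -> bool:
    """Return True if the pattern matches anywhere in the text."""
    return pattern.search(text) is not None

for name, pattern in (("LOOSE", LOOSE), ("STRICT", STRICT)):
    print(name, "ham:", hits(pattern, ham), "spam:", hits(pattern, spam))
```

In this sketch the \S variant fires on the ham sample because the semicolon after the closing tags counts as a non-space character, while the [A-Za-z] variant fires only on the tag-split word. Note that Python's \w is Unicode-aware by default for str patterns, so it does not reproduce the locale behaviour of Perl's \w that the quoted concern is about.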
