https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7022
Kent Oyer <k...@mxguardian.net> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |k...@mxguardian.net

--- Comment #20 from Kent Oyer <k...@mxguardian.net> ---
Sorry for digging up an old thread, but hopefully this helps someone.

I have created an ASCII plugin that should alleviate this problem. The plugin is available here:

https://github.com/mxguardian/Mail-SpamAssassin-Plugin-ASCII

It has no external dependencies and is very fast because the rules are pre-compiled. Existing rules continue to work as before. The plugin just adds a new rule type, 'ascii', that matches against body text that has been converted to ASCII.

I've tested it on a small corpus and found a 4% reduction in false negatives with no change in false positives, a number that I think will increase as more rules are converted to ASCII rules.

The problem with using something like Text::Unidecode is that it transliterates based on the meaning of the characters rather than their appearance. Therefore I had to create my own character map.

Feedback welcome.
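
To illustrate the difference between transliterating by meaning and by appearance, here is a minimal Python sketch. It is not the plugin's code (the plugin is written in Perl), and the `LOOKALIKES` table, the function names, and the sample rule are all hypothetical; the real character map is much larger.

```python
import re
import unicodedata

# Hypothetical, deliberately tiny appearance-based map (assumption, not the
# plugin's real table). Each key is mapped to the ASCII letter it LOOKS like.
LOOKALIKES = {
    "\u0430": "a",  # CYRILLIC SMALL LETTER A
    "\u0435": "e",  # CYRILLIC SMALL LETTER IE
    "\u043e": "o",  # CYRILLIC SMALL LETTER O
    "\u0440": "p",  # CYRILLIC SMALL LETTER ER: looks like "p", sounds like "r"
    "\u0441": "c",  # CYRILLIC SMALL LETTER ES: looks like "c", sounds like "s"
    "\u0456": "i",  # CYRILLIC SMALL LETTER BYELORUSSIAN-UKRAINIAN I
    "\u03bf": "o",  # GREEK SMALL LETTER OMICRON
}


def to_ascii_by_appearance(text):
    """Fold look-alike characters to ASCII; unmapped characters pass through."""
    # NFKD splits accented letters into base + combining mark and folds
    # compatibility forms such as fullwidth letters.
    decomposed = unicodedata.normalize("NFKD", text)
    out = []
    for ch in decomposed:
        if unicodedata.combining(ch):
            continue  # drop the accent, keep the base letter
        out.append(LOOKALIKES.get(ch, ch))
    return "".join(out)


# Hypothetical rule, compiled once up front and matched against the
# converted text rather than the raw body.
RULE = re.compile(r"\bpaypal account\b", re.IGNORECASE)


def ascii_rule_hits(body):
    return RULE.search(to_ascii_by_appearance(body)) is not None


# "paypal account" spelled with Cyrillic look-alikes for p, a, c and o.
spoofed = "Your \u0440\u0430y\u0440\u0430l \u0430\u0441\u0441\u043eunt is limited"
print(ascii_rule_hits(spoofed))          # rule applied to the converted text
print(RULE.search(spoofed) is not None)  # same rule applied to the raw text
```

A meaning-based transliteration would turn the Cyrillic letters above into "r" and "s", so the obfuscated word would still not read as "paypal" after conversion. Mapping by appearance is what lets an ordinary ASCII pattern match it.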