https://issues.apache.org/SpamAssassin/show_bug.cgi?id=7022
--- Comment #18 from Ivo Truxa <[email protected]> ---
(In reply to John Hardin from comment #5)
> If this is done globally we'll lose the ability to detect some forms of
> obfuscation. On the flip side, discarding the accents may have the effect of
> making that obfuscation pointless.
>
> How does that balance out? Do we gain more from discarding all accents than
> we lose from being able to tell whether or not accents are being used to
> obfuscate a common word, which is a fairly strong spam sign?

I am coming back to this comment once more. As I wrote, this needs to be tested to see how it behaves in reality, but I am convinced it can only be an improvement.

Ask yourself why spammers obfuscate certain words. Certainly not because those words are hammy, but because every anti-spam filter would catch them immediately. They obfuscate the most spammy words. So the fear that removing the obfuscation loses a strong spam marker is unfounded. Quite the contrary: the original, unobfuscated spam word becomes even more spammy than before (thanks to many more hits) and makes the spam easier to catch.

Only when the obfuscated word transliterates into something other than the original spam word can the score of the original word not be used (and increased). But in the vast majority of such cases you would get a new nonsense word, which would become as strong a spam marker as the obfuscated version was.

Ivo
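To illustrate the argument, here is a minimal sketch in Python rather than in SpamAssassin's own Perl. It assumes that "discarding accents" means Unicode NFKD decomposition followed by dropping combining marks; the helper names `strip_accents` and `token_counts` and the sample messages are hypothetical and not taken from any SpamAssassin code. It shows accented variants collapsing onto one token, and a character with no decomposition surviving as a separate nonsense token.

```python
import unicodedata
from collections import Counter


def strip_accents(text: str) -> str:
    """Decompose characters (NFKD) and drop the combining marks."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def token_counts(messages, fold=False):
    """Count whitespace-separated tokens, optionally with accents removed."""
    counts = Counter()
    for msg in messages:
        for tok in msg.lower().split():
            counts[strip_accents(tok) if fold else tok] += 1
    return counts


# Hypothetical sample corpus: three spellings of the same spam word,
# plus one spelling using "ø", which has no Unicode decomposition.
spam = ["buy vìágrà now", "cheap víagrá here", "viagra sale", "new vøagra offer"]

raw = token_counts(spam)
folded = token_counts(spam, fold=True)

print(raw["viagra"], folded["viagra"])  # hits for the plain word, before and after
print(folded["vøagra"])                 # the variant that does not fold
```

Without folding, each accented spelling is a separate token with a single hit. With folding, the hits accumulate on the plain word, while the "ø" variant stays a distinct token that a learner could still score on its own.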
