https://issues.apache.org/SpamAssassin/show_bug.cgi?id=7022
--- Comment #17 from Ivo Truxa <[email protected]> ---

In fact, I think this can be made transparent to the user and the rule developer, so that neither needs to bother with it. Just as now, I would leave the admin the choice to disable normalizing altogether, to enable Unicode normalizing, or to enable ASCII normalizing. Then SA, when processing rules, would check whether a rule contains non-ASCII characters. If it does, SA would match it against the UTF8 or the non-normalized version (depending on normalize_charset); otherwise it would match it against the ASCII-normalized one. This would cover the vast majority of cases. Only in rather rare cases might someone want to run an ASCII regex on the non-ASCII version, and for such cases a special tflag could be used.

However, as I have said previously, I think the default setting should stay as it is: no normalizing. But both the UTF8 and the ASCII normalizing should be available to administrators who want to use them, regardless of whether any tflag for normalized/non-normalized versions is available.

Finally, if I am not mistaken, there is currently no tflag for the Unicode normalizing either, so any rules written for UTF8, or for some specific code pages, do not always work correctly.
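To make the proposed selection logic concrete, here is a rough sketch in Python. It is not SpamAssassin code (SA is written in Perl). The names `to_ascii`, `select_text`, `rule_matches`, `ascii_normalize` and `force_unnormalized` are hypothetical, and the NFKD-based folding only stands in for whatever ASCII normalizing SA would actually use. The `raw` and `utf8` arguments stand for the non-normalized and the UTF8 versions of the message text, both assumed to be already available as strings.

```python
import re
import unicodedata


def to_ascii(text: str) -> str:
    """Illustrative ASCII folding: decompose accents, then drop non-ASCII."""
    decomposed = unicodedata.normalize("NFKD", text)
    return decomposed.encode("ascii", "ignore").decode("ascii")


def is_ascii_only(pattern: str) -> bool:
    """True if the rule's regex contains no non-ASCII characters."""
    return all(ord(ch) < 128 for ch in pattern)


def select_text(pattern, raw, utf8, *, normalize_charset=False,
                ascii_normalize=False, force_unnormalized=False):
    """Pick the message version a rule should be matched against."""
    # Version used whenever ASCII folding does not apply.
    base = utf8 if normalize_charset else raw
    if not ascii_normalize:
        return base  # default setting: behaviour stays as it is today
    # Non-ASCII rules, or rules carrying the (hypothetical) tflag,
    # keep matching against the UTF8 or non-normalized text.
    if force_unnormalized or not is_ascii_only(pattern):
        return base
    return to_ascii(utf8)


def rule_matches(pattern, raw, utf8, **opts):
    """Run one rule regex against the version chosen by select_text."""
    return re.search(pattern, select_text(pattern, raw, utf8, **opts)) is not None
```

With `ascii_normalize` enabled, a plain-ASCII rule such as `Viagra` would hit a body containing "Viágra", while a rule written with the accented character would still be matched against the unfolded text. With both options left off, nothing changes for existing rules.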
