http://issues.apache.org/SpamAssassin/show_bug.cgi?id=5691


[EMAIL PROTECTED] changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
             Status|NEW                         |RESOLVED
         Resolution|                            |INVALID




------- Additional Comments From [EMAIL PROTECTED]  2007-10-24 15:01 -------
The proposed patch is incorrect and this bug is INVALID.

The purpose of the utf8::downgrade call is to gain the speed benefit of a
cleared utf8 flag whenever the characters can be represented without it.  In
this particular case the characters cannot be represented without the utf8
flag, so the call leaves the flag set, as intended.
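
To illustrate the behaviour described above, here is a minimal sketch.  It
assumes the call passes a true FAIL_OK second argument, which is what lets it
return quietly instead of dying; the variable names are made up for the
example and are not taken from the SpamAssassin source.

```perl
use strict;
use warnings;

# Every character fits in one byte, so the downgrade can succeed.
my $latin = "caf\x{e9}";
utf8::upgrade($latin);                       # force the utf8 flag on
my $latin_ok = utf8::downgrade($latin, 1);   # second argument is FAIL_OK

# These characters are above 0xFF, so the downgrade cannot succeed.
my $wide = "\x{65e5}\x{672c}";
my $wide_ok = utf8::downgrade($wide, 1);     # returns false, string untouched

my $latin_flag = utf8::is_utf8($latin) ? 1 : 0;   # flag cleared
my $wide_flag  = utf8::is_utf8($wide)  ? 1 : 0;   # flag still set
```

Without FAIL_OK the second call would die rather than return false.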

Many rules are not charset-normalization-aware and thus may perform poorly or
incorrectly with charset normalization enabled.  For example, I have seen rules
test for non-ASCII by using [\x80-\xff].  With charset normalization, they need
to use [^\x00-\x7f] instead, because normalized text can contain characters
above \xff.  Similarly, rules might need to use [0-9] instead of \d, and \s and
\w might match more characters than intended.
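
The difference between these character classes can be seen in a short sketch.
The sample string is invented for the example and simply stands in for
normalized message text containing characters above \xff.

```perl
use strict;
use warnings;

# Hypothetical normalized text: two Arabic-Indic digits, an ideographic
# space (U+3000), and a euro sign (U+20AC).
my $text = "price \x{0663}\x{0665}\x{3000}\x{20ac}";

# The old non-ASCII test only covers code points 0x80..0xFF.
my $old_nonascii = ($text =~ /[\x80-\xff]/)  ? 1 : 0;   # 0
# The negated ASCII class catches every non-ASCII character.
my $new_nonascii = ($text =~ /[^\x00-\x7f]/) ? 1 : 0;   # 1

# \d also matches non-ASCII decimal digits; [0-9] does not.
my $unicode_digit = ($text =~ /\d/)    ? 1 : 0;         # 1
my $ascii_digit   = ($text =~ /[0-9]/) ? 1 : 0;         # 0

# \s matches the ideographic space as well as ASCII whitespace.
my $wide_space = ("\x{3000}" =~ /^\s$/) ? 1 : 0;        # 1
```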




------- You are receiving this mail because: -------
You are the assignee for the bug, or are watching the assignee.
