Morten Lied Johansen <[EMAIL PROTECTED]> added the comment:

One issue that the current implementation has, which I can't see have 
been commented on here, is that it kills utf8 characters (and probably 
every other character encoding that is multi-byte).

A é character in an utf8 encoded string will be represented by two 
bytes. When passed through re.escape, those two bytes are checked 
individually, and both are considered non-alphanumeric, and is 
consequently escaped, breaking the utf8 string into complete gibberish 
instead.

----------
nosy: +mortenlj

_______________________________________
Python tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue2650>
_______________________________________
_______________________________________________
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com

Reply via email to