Philippe Ombredanne <pombreda...@nexb.com> added the comment: Thank for the (re) explanation. Unicode is tough! Basically this is the issue i have really in the end with the folding: what used to be a proper alpha string is not longer one after a lower() because the second codepoint is a punctuation and I use a regex split on the \W word class that then behaves differently when the string is lowercased as we have an extra punctuation then to break on. I will find a way around these (rare) cases alright!
Sorry for the noise. ``` >>> 'İ'.isalpha() True >>> 'İ'.lower().isalpha() False ``` ---------- _______________________________________ Python tracker <rep...@bugs.python.org> <https://bugs.python.org/issue34723> _______________________________________ _______________________________________________ Python-bugs-list mailing list Unsubscribe: https://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com