[issue34723] lower() on Turkish letter "İ" returns a 2-chars-long string

Philippe Ombredanne Tue, 07 Jan 2020 12:35:31 -0800

Philippe Ombredanne <pombreda...@nexb.com> added the comment:

Thanks for the (re)explanation. Unicode is tough!
Basically, this is the issue I have in the end with the lowercasing: what
used to be a proper alpha string is no longer one after a lower(), because the
second code point (U+0307 COMBINING DOT ABOVE) is a combining mark rather than
a letter. I use a regex split on the \W class, which then behaves differently
once the string is lowercased, since there is an extra non-word character to
break on. I will find a way around these (rare) cases all right!


Sorry for the noise.

```
>>> 'İ'.isalpha()
True
>>> 'İ'.lower().isalpha()
False
```
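
For illustration, here is a minimal sketch of how the extra code point changes
a split on \W, plus one possible workaround. The sample word and the helper
name lower_without_combining_dot are assumptions made up for this example, not
anything from the tracker discussion. The helper only handles U+0130 and is not
a general locale-aware lowercasing.

```python
import re
import unicodedata

word = "\u0130stanbul"   # "İstanbul", starting with U+0130
lowered = word.lower()   # "i" + U+0307 (combining dot above) + "stanbul"

# The extra code point is a nonspacing mark (Mn), which \w does not match.
extra_category = unicodedata.category(lowered[1])

tokens_before = re.split(r"\W+", word)
tokens_after = re.split(r"\W+", lowered)


def lower_without_combining_dot(text):
    """Hypothetical helper: map U+0130 to a plain 'i' before lowercasing."""
    return text.replace("\u0130", "i").lower()


tokens_fixed = re.split(r"\W+", lower_without_combining_dot(word))

print(extra_category)   # Mn
print(tokens_before)    # ['İstanbul']
print(tokens_after)     # ['i', 'stanbul']
print(tokens_fixed)     # ['istanbul']
```

Replacing U+0130 before calling lower() keeps the string length unchanged, at
the cost of dropping the distinction that the combining dot was preserving.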

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue34723>
_______________________________________