Steven D'Aprano <steve+pyt...@pearwood.info> added the comment:

In addition, you are probably hitting normalization issues. There are two ways to get the Cyrillic character 'й' in your string. One is a single code point; the other is two code points:
>>> import unicodedata
>>> a = '\u0439'        # 'й' as one precomposed code point
>>> b = '\u0438\u0306'  # 'й' as 'и' followed by a combining breve
>>> len(a), unicodedata.name(a)
(1, 'CYRILLIC SMALL LETTER SHORT I')
>>> len(b), unicodedata.name(b[0]), unicodedata.name(b[1])
(2, 'CYRILLIC SMALL LETTER I', 'COMBINING BREVE')

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue42614>
_______________________________________
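
If normalization is indeed the cause, one possible workaround is to bring both strings to the same normalization form before comparing them. The following is only a minimal sketch: the helper name same_text and the choice of NFC as the default form are assumptions made for illustration, not something taken from the original report.

```python
import unicodedata

# Assumption: these escapes reproduce the two spellings of 'й' discussed above.
precomposed = "\u0439"        # CYRILLIC SMALL LETTER SHORT I
decomposed = "\u0438\u0306"   # CYRILLIC SMALL LETTER I + COMBINING BREVE


def same_text(x, y, form="NFC"):
    """Hypothetical helper: compare two strings after normalizing both."""
    return unicodedata.normalize(form, x) == unicodedata.normalize(form, y)


print(precomposed == decomposed)                        # False
print(same_text(precomposed, decomposed))               # True
print(len(unicodedata.normalize("NFC", decomposed)))    # 1
print(len(unicodedata.normalize("NFD", precomposed)))   # 2
```

NFC composes the pair into the single code point, while NFD does the reverse. Either form works for comparison as long as both sides use the same one.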