[issue27496] unicodedata.name() doesn't have names for control characters

2021-03-08 Thread STINNER Victor
Change by STINNER Victor -- nosy: -vstinner

[issue27496] unicodedata.name() doesn't have names for control characters

2021-02-26 Thread Eryk Sun
Change by Eryk Sun -- versions: +Python 3.10, Python 3.8, Python 3.9 -Python 2.7, Python 3.5, Python 3.6

[issue27496] unicodedata.name() doesn't have names for control characters

2016-07-12 Thread Eryk Sun
Changes by Eryk Sun -- versions: +Python 2.7, Python 3.6

[issue27496] unicodedata.name() doesn't have names for control characters

2016-07-12 Thread Eryk Sun
Changes by Eryk Sun -- components: +Unicode nosy: +ezio.melotti, haypo

[issue27496] unicodedata.name() doesn't have names for control characters

2016-07-12 Thread Zack Weinberg
Zack Weinberg added the comment: It looks to me as if NameAliases.txt is the better reference for the C0 and C1 controls. It matches the UnicodeData.txt field 10 names for most entries where the field 1 name is "<control>", but it has names for U+0080, U+0081, U+0084, and U+0099, which have no field
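A minimal sketch of the NameAliases.txt approach described in this comment. The helper names (`parse_aliases`, `control_name`), the preference order of alias types, and the embedded sample are assumptions made for illustration; the sample lines follow the `code;alias;type` layout of NameAliases.txt and should be checked against the real file before relying on them.

```python
# Hypothetical sample in the NameAliases.txt layout: code;alias;type
SAMPLE = """\
# code;alias;type
0000;NULL;control
0000;NUL;abbreviation
0080;PADDING CHARACTER;figment
0080;PAD;abbreviation
0084;INDEX;control
0084;IND;abbreviation
"""

# Assumed preference: a "control" alias first, then a "figment" alias.
PREFERRED_TYPES = ("control", "figment")


def parse_aliases(text):
    """Map each code point to its list of (alias, type) pairs."""
    aliases = {}
    for raw in text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        code, alias, kind = line.split(";")
        aliases.setdefault(int(code, 16), []).append((alias, kind))
    return aliases


def control_name(aliases, code_point):
    """Return the preferred alias for a code point, or None if it has none."""
    for wanted in PREFERRED_TYPES:
        for alias, kind in aliases.get(code_point, []):
            if kind == wanted:
                return alias
    return None


if __name__ == "__main__":
    table = parse_aliases(SAMPLE)
    for cp in (0x00, 0x80, 0x84):
        print("U+{:04X} {}".format(cp, control_name(table, cp)))
```

Abbreviation aliases such as NUL are deliberately skipped here, on the assumption that a name() fallback should return a full name rather than a short form.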

[issue27496] unicodedata.name() doesn't have names for control characters

2016-07-12 Thread Eryk Sun
Eryk Sun added the comment: Character names are in field 1 of UnicodeData.txt [1][2]. For controls the name is just "<control>". In Tools/unicode/makeunicodedata.py, the makeunicodename function skips names that start with "<". Instead of skipping the character, it could fall back on the Unicode 1.0
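The fallback suggested in this comment could look roughly like the sketch below. It is not the code in makeunicodedata.py; `pick_name` is a hypothetical helper that works on one semicolon-separated UnicodeData.txt record, taking field 1 as the name and field 10 as the Unicode 1.0 name.

```python
def pick_name(record):
    """Choose a name for one UnicodeData.txt record.

    Field 1 is the character name and field 10 is the Unicode 1.0 name.
    Placeholder names such as "<control>" start with "<"; for those,
    fall back on field 10 and return None when that field is empty.
    """
    fields = record.split(";")
    name, old_name = fields[1], fields[10]
    if name.startswith("<"):
        return old_name or None
    return name


if __name__ == "__main__":
    records = [
        "0000;<control>;Cc;0;BN;;;;;N;NULL;;;;",
        "0080;<control>;Cc;0;BN;;;;;N;;;;;",
        "0041;LATIN CAPITAL LETTER A;Lu;0;L;;;;;N;;;;0061;",
    ]
    for record in records:
        print(record.split(";")[0], pick_name(record))
```

The U+0080 record shows the limitation of this fallback: its field 10 is empty, so the character still ends up without a name.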

[issue27496] unicodedata.name() doesn't have names for control characters

2016-07-12 Thread R. David Murray
R. David Murray added the comment: That information is programmatically generated from data files obtained from the Unicode project, as far as I know. -- nosy: +r.david.murray

[issue27496] unicodedata.name() doesn't have names for control characters

2016-07-12 Thread Zack Weinberg
New submission from Zack Weinberg: unicodedata.name() doesn't have name information for the C0 and C1 control characters. To see this, run pprint.pprint(["U+{:04X} {}".format(n, unicodedata.name(chr(n), "")) for n in range(256)]) and you will observe that no name is printed for U+0000 through U+001F and
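A slightly expanded version of the check in this report, as a self-contained script. The names `describe` and `unnamed` and the "<no name>" placeholder are introduced here for illustration only; the list it builds depends on the Python version and its bundled Unicode database.

```python
import unicodedata


def describe(code_point, default="<no name>"):
    """Format one code point as 'U+XXXX NAME', using default when unnamed."""
    return "U+{:04X} {}".format(
        code_point, unicodedata.name(chr(code_point), default)
    )


# Code points below U+0100 for which unicodedata.name() has no name.
unnamed = [n for n in range(256) if unicodedata.name(chr(n), None) is None]

if __name__ == "__main__":
    for n in unnamed:
        print(describe(n))
```

Passing a default as the second argument avoids the ValueError that unicodedata.name() raises for a character without a name.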