Marc-Andre Lemburg <m...@egenix.com> added the comment:

The code point is also not listed as a decimal digit (relevant for int() decimal parsing):
>>> unicodedata.decimal(unicode('三', 'utf-8'))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: not a decimal

This is the relevant part of the script:

    for line in open(unihan):
        if not line.startswith('U+'):
            continue
        code, tag, value = line.split(None, 3)[:3]
        if tag not in ('kAccountingNumeric', 'kPrimaryNumeric',
                       'kOtherNumeric'):
            continue
        value = value.strip().replace(',', '')
        i = int(code[2:], 16)
        # Patch the numeric field
        if table[i] is not None:
            table[i][8] = value

The decimal column is not set for code points that have a kPrimaryNumeric value set. Position table[i][8] refers to the numeric database entry, which correctly gives:

>>> unicodedata.numeric(unicode('三', 'utf-8'))
3.0

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue10575>
_______________________________________
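The difference between the decimal and numeric entries described in the comment above can be checked side by side with a short sketch. It is written for Python 3 (the sessions in the comment use Python 2's unicode()), and the helper name numeric_properties is hypothetical, introduced only for illustration. It relies on the optional default argument of unicodedata.decimal(), digit() and numeric() to return None instead of raising ValueError.

```python
import unicodedata


def numeric_properties(ch):
    """Return (decimal, digit, numeric) for one character.

    Hypothetical helper for illustration: a field is None when the
    Unicode database has no value of that kind for the character.
    """
    # Supplying a default suppresses the ValueError seen in the traceback.
    return (
        unicodedata.decimal(ch, None),
        unicodedata.digit(ch, None),
        unicodedata.numeric(ch, None),
    )


han_three = '\u4e09'   # the CJK ideograph for "three" used in the comment
ascii_three = '3'

print(numeric_properties(han_three))    # only the numeric field is set
print(numeric_properties(ascii_three))  # all three fields are set

# int() parses only characters that carry a decimal value,
# so the ideograph is rejected even though numeric() knows it.
try:
    int(han_three)
except ValueError as exc:
    print('int() rejected it:', exc)
```

With the database shipped in current Python 3 releases, the ideograph should report only a numeric value, which matches the observation that the script patches the numeric field and leaves the decimal column unset.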