Marc-Andre Lemburg <[email protected]> added the comment:
The code point is also not listed as a decimal digit (which is what matters
for int() decimal parsing):
>>> unicodedata.decimal(unicode('三', 'utf-8'))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: not a decimal
This is the relevant part of the script:
    for line in open(unihan):
        if not line.startswith('U+'):
            continue
        code, tag, value = line.split(None, 3)[:3]
        if tag not in ('kAccountingNumeric', 'kPrimaryNumeric',
                       'kOtherNumeric'):
            continue
        value = value.strip().replace(',', '')
        i = int(code[2:], 16)
        # Patch the numeric field
        if table[i] is not None:
            table[i][8] = value
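
For illustration only, here is a self-contained sketch of the same patching
logic run against made-up input. The sample lines, the dict-based table and
the patch_numeric() helper are assumptions for this example, not part of the
actual script; the field positions (6 = decimal, 7 = digit, 8 = numeric) are
assumed to follow the UnicodeData.txt column order.

```python
# Hypothetical sample lines in the "U+XXXX tag value" layout parsed above.
SAMPLE_LINES = [
    "# comment lines are skipped",
    "U+4E09\tkPrimaryNumeric\t3",
    "U+4E09\tkMandarin\tsan1",
    "U+842C\tkPrimaryNumeric\t10,000",
]

NUMERIC_TAGS = ('kAccountingNumeric', 'kPrimaryNumeric', 'kOtherNumeric')


def patch_numeric(lines, table):
    """Copy numeric values from the sample lines into field 8 of each record."""
    for line in lines:
        if not line.startswith('U+'):
            continue
        code, tag, value = line.split(None, 3)[:3]
        if tag not in NUMERIC_TAGS:
            continue
        value = value.strip().replace(',', '')
        i = int(code[2:], 16)
        # Only the numeric field (index 8) is touched; the decimal (6)
        # and digit (7) fields keep whatever they had before.
        if table.get(i) is not None:
            table[i][8] = value
    return table


# Hypothetical records: nine string fields each, numeric columns still empty.
table = {
    0x4E09: ['4E09', '<CJK Ideograph>', 'Lo', '0', 'L', '', '', '', ''],
    0x842C: ['842C', '<CJK Ideograph>', 'Lo', '0', 'L', '', '', '', ''],
}
patch_numeric(SAMPLE_LINES, table)
print(table[0x4E09][6:9])
print(table[0x842C][6:9])
```

After the call, only the last of the three numeric-related fields is filled
in for each record, which matches the behaviour described in this comment.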
The decimal column is not set for code points that have a kPrimaryNumeric
value. The script only patches table[i][8], which is the numeric database
entry, and that entry correctly gives:
>>> unicodedata.numeric(unicode('三', 'utf-8'))
3.0
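
The difference between the three properties can be checked side by side.
This is a minimal sketch written for Python 3 (so no unicode() call is
needed); the helper names numeric_properties() and int_accepts() are made up
for this example. It uses the optional default argument of the unicodedata
lookups to get None instead of a ValueError.

```python
import unicodedata


def numeric_properties(ch):
    """Return (decimal, digit, numeric) for one character, None where unset."""
    return (
        unicodedata.decimal(ch, None),
        unicodedata.digit(ch, None),
        unicodedata.numeric(ch, None),
    )


def int_accepts(text):
    """Report whether int() can parse the given text."""
    try:
        int(text)
    except ValueError:
        return False
    return True


# ASCII digit three, ARABIC-INDIC DIGIT THREE, and the CJK ideograph for three.
for ch in ("3", "\u0663", "\u4e09"):
    print("U+%04X" % ord(ch), numeric_properties(ch), int_accepts(ch))
```

The first two characters carry a decimal value and are accepted by int();
U+4E09 has only a numeric value, so int() rejects it.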
----------
_______________________________________
Python tracker <[email protected]>
<http://bugs.python.org/issue10575>
_______________________________________