Marc-Andre Lemburg <m...@egenix.com> added the comment:

The code point is also not listed as a decimal digit (which is what int() 
relies on for decimal parsing):

>>> unicodedata.decimal(unicode('三', 'utf-8'))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
ValueError: not a decimal

This is the relevant part of the script:

        for line in open(unihan):
            if not line.startswith('U+'):
                continue
            code, tag, value = line.split(None, 3)[:3]
            if tag not in ('kAccountingNumeric', 'kPrimaryNumeric',
                           'kOtherNumeric'):
                continue
            value = value.strip().replace(',', '')
            i = int(code[2:], 16)
            # Patch the numeric field
            if table[i] is not None:
                table[i][8] = value
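
For anyone who wants to try the loop in isolation, here is a minimal
sketch of it as a standalone function. It is not the script itself: the
name patch_numeric, the dict-based table and the sample lines are
assumptions made for illustration, the row layout is assumed to follow
the UnicodeData.txt field order (6 = decimal, 7 = digit, 8 = numeric),
and it is written for Python 3 rather than the Python 2 used in this
message.

```python
# Illustrative stand-in for the script's loop; names and data are hypothetical.
NUMERIC_TAGS = ('kAccountingNumeric', 'kPrimaryNumeric', 'kOtherNumeric')


def patch_numeric(table, lines):
    """Copy Unihan numeric values into field 8 of rows that already exist.

    Returns the list of code points that were patched.
    """
    patched = []
    for line in lines:
        # Unihan data lines start with "U+"; comments and blanks do not.
        if not line.startswith('U+'):
            continue
        code, tag, value = line.split(None, 3)[:3]
        if tag not in NUMERIC_TAGS:
            continue
        value = value.strip().replace(',', '')
        i = int(code[2:], 16)
        # Only the numeric field (8) is touched; decimal (6) and digit (7)
        # are left exactly as they were.
        if table.get(i) is not None:
            table[i][8] = value
            patched.append(i)
    return patched


# Assumed sample input in the tab-separated Unihan layout.
sample = [
    '# comment line, skipped',
    'U+4E09\tkPrimaryNumeric\t3',
    'U+4E09\tkMandarin\ts\u0101n',
]
table = {0x4E09: [''] * 15}   # one empty 15-field row for U+4E09
patched = patch_numeric(table, sample)
print(patched, repr(table[0x4E09][6]), repr(table[0x4E09][8]))
```

The point of the sketch is only that field 8 changes while field 6 stays
empty, which matches the behaviour described in this comment.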

The script does not set the decimal column for code points that have a
kPrimaryNumeric value. It only patches table[i][8], the numeric database
entry, which correctly gives:

>>> unicodedata.numeric(unicode('三', 'utf-8'))
3.0
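
To see the three properties side by side, a small sketch along these
lines may help. It uses Python 3 syntax (the sessions above are
Python 2), and the helper names numeric_properties and parses_as_int are
made up for this example. On a current CPython, the Han character is
expected to report only a numeric value, while the ASCII and
Arabic-Indic digits report all three and are accepted by int().

```python
import unicodedata


def numeric_properties(ch):
    """Return (decimal, digit, numeric) for one character, None where unset."""
    return (
        unicodedata.decimal(ch, None),
        unicodedata.digit(ch, None),
        unicodedata.numeric(ch, None),
    )


def parses_as_int(text):
    """Report whether int() accepts the given string."""
    try:
        int(text)
    except ValueError:
        return False
    return True


han_three = '\u4e09'      # the character discussed in this issue
ascii_three = '3'
arabic_three = '\u0663'   # ARABIC-INDIC DIGIT THREE

for ch in (han_three, ascii_three, arabic_three):
    print('U+%04X' % ord(ch), numeric_properties(ch), parses_as_int(ch))
```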

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue10575>
_______________________________________