zheng <zheng0....@gmail.com> added the comment:

I propose we copy over the exact changes made to the Python 3 documentation.

I looked through the code mentioned in the other thread, namely
`Objects/unicodeobject.c` and `Tools/unicode/makeunicodedata.py`. The
implementation is identical between Python 2 and Python 3. The only difference
appears to be the Unicode version used.

    # decimal digit, integer digit
    decimal = 0
    if record[6]:
        flags |= DECIMAL_MASK
        decimal = int(record[6])
    digit = 0
    if record[7]:
        flags |= DIGIT_MASK
        digit = int(record[7])
    if record[8]:
        flags |= NUMERIC_MASK
        numeric.setdefault(record[8], []).append(char)
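
To illustrate what those three fields mean at runtime, here is a minimal
Python 3 sketch. It assumes that record[6], record[7] and record[8] are the
decimal, digit and numeric columns of UnicodeData.txt, which is how I read the
excerpt above. The helper name `classify` and the sample characters are my own
choices for illustration and do not come from the thread.

```python
import unicodedata

def classify(ch):
    """Return the numeric properties and str predicates for one character.

    `classify` is a hypothetical helper, not part of CPython.
    """
    return {
        # The second argument is the default returned when the value is absent.
        "decimal": unicodedata.decimal(ch, None),
        "digit": unicodedata.digit(ch, None),
        "numeric": unicodedata.numeric(ch, None),
        "isdecimal": ch.isdecimal(),
        "isdigit": ch.isdigit(),
        "isnumeric": ch.isnumeric(),
    }

# DIGIT FIVE, SUPERSCRIPT TWO, VULGAR FRACTION ONE HALF
samples = ["5", "\u00b2", "\u00bd"]
for ch in samples:
    print(unicodedata.name(ch), classify(ch))
```

SUPERSCRIPT TWO has a digit value but no decimal value, so it should pass
isdigit() but not isdecimal(). VULGAR FRACTION ONE HALF has only a numeric
value, so it should pass isnumeric() alone.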

As another form of validation, I enumerated all the digits and decimals and
compared the results between versions. The general change appears to be that
Python 3 adds a number of new Unicode characters. The one exception is NEW
TAI LUE THAM DIGIT ONE, which gets recategorized as a digit.

Python 2, compiled with UCS4:

    import unicodedata
    # 0x110000 so that the last code point, U+10FFFF, is included
    for u in map(unichr, xrange(0x110000)):
        if unicode.isdigit(u):
            print(unicodedata.name(u))

Python 3:

    from unicodedata import name
    # 0x110000 so that the last code point, U+10FFFF, is included
    for u in map(chr, range(0x110000)):
        if str.isdigit(u):
            print(name(u))
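
The comparison step itself could be sketched roughly as follows, in Python 3.
The names `listing` and `diff_listings` are hypothetical helpers of mine, not
anything in CPython or the thread. Since a single interpreter cannot see
another version's data, the demo at the bottom compares the isdecimal and
isdigit listings as a stand-in for two per-version listings.

```python
import sys
import unicodedata

def listing(predicate):
    """Map each code point accepted by `predicate` to its Unicode name."""
    result = {}
    for cp in range(sys.maxunicode + 1):
        ch = chr(cp)
        if predicate(ch):
            # Fall back to a placeholder instead of raising ValueError.
            result[cp] = unicodedata.name(ch, "<unnamed>")
    return result

def diff_listings(old, new):
    """Return (added, removed) code points going from `old` to `new`."""
    added = sorted(set(new) - set(old))
    removed = sorted(set(old) - set(new))
    return added, removed

decimals = listing(str.isdecimal)
digits = listing(str.isdigit)
added, removed = diff_listings(decimals, digits)
print("Unicode data version:", unicodedata.unidata_version)
print("digits that are not decimals:", len(added))
print("decimals that are not digits:", len(removed))
```

To compare two interpreters, one option is to dump each listing to a text
file and feed the two parsed listings to `diff_listings`.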

----------
nosy: +zheng

_______________________________________
Python tracker <rep...@bugs.python.org>
<https://bugs.python.org/issue36417>
_______________________________________
_______________________________________________
Python-bugs-list mailing list