Julien Palard added the comment:

“digits which do not form decimal radix forms”

> “forming a form” seems a long way of saying very little. The difference seems 
> a bit vague

> I gather that digits not in the Unicode “decimal digit” category are often 
> (always?) still decimal digits

I expected them not to be, but they often do represent a base 10 value:

>>> import sys
>>> import unicodedata
>>> chars = ''.join(map(chr, range(sys.maxunicode+1)))
>>> decimals = ''.join(filter(str.isdecimal, chars))
>>> digits = ''.join(filter(str.isdigit, chars))
>>> non_decimal_digits = set(digits) - set(decimals)
>>> from collections import Counter
>>> Counter([unicodedata.digit(char) for char in non_decimal_digits])
Counter({1: 15, 2: 14, 3: 14, 4: 14, 5: 13, 6: 13, 7: 13, 8: 13, 9: 13, 0: 6})

Note, however, that there is one extra character for the values in the range
[1, 4]: these are the [Kharosthi](https://en.wikipedia.org/wiki/Kharosthi)
numbers, which do not use base 10 but a notation reminiscent of Roman numerals.

So, clearly, not all digits are a notation for a base 10 value.
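
A minimal sketch to check this directly, assuming the Kharosthi digits are the four code points U+10A40 to U+10A43 (the names `kharosthi` and `kharosthi_values` are only for illustration):

```python
import unicodedata

# Assumption: the Kharosthi digits ONE..FOUR occupy U+10A40..U+10A43.
kharosthi = [chr(cp) for cp in range(0x10A40, 0x10A44)]

for ch in kharosthi:
    # Show how each character is classified by the str methods.
    print(
        f"U+{ord(ch):04X}",
        unicodedata.name(ch),
        "isdigit:", ch.isdigit(),
        "isdecimal:", ch.isdecimal(),
        "digit value:", unicodedata.digit(ch),
    )

# Digit values carried by these characters, in code point order.
kharosthi_values = [unicodedata.digit(ch) for ch in kharosthi]
```

These characters should report isdigit as true and isdecimal as false, even though they carry the digit values 1 to 4.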
 
> but primarily used for a symbolic or typographical meaning more than in a 
> plain number, e.g. superscripts, subscripts and other fonts, added circles 
> and other decorations.

Which also can't be used to form a base 10 number.
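
To illustrate, here is a small sketch (the helper `try_int` and the sample characters are my own choices, not part of any proposal) showing that such decorated digits are accepted by isdigit but rejected by int():

```python
def try_int(text):
    """Return int(text), or None if int() rejects the string."""
    try:
        return int(text)
    except ValueError:
        return None


# A few decorated digits picked for illustration.
decorated = {
    "SUPERSCRIPT TWO": "\u00b2",
    "SUBSCRIPT THREE": "\u2083",
    "CIRCLED DIGIT ONE": "\u2460",
}

# Map each label to (isdigit, isdecimal, result of int()).
results = {
    label: (ch.isdigit(), ch.isdecimal(), try_int(ch))
    for label, ch in decorated.items()
}

for label, outcome in results.items():
    print(label, outcome)
```

Each entry should come out as a digit that is not a decimal character and that int() refuses to parse.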

So here is another proposal for isdecimal, probably more human-friendly:

    Return true if all characters in the string are decimal
    characters and there is at least one character, false
    otherwise. Decimal characters are those that can be used to form
    numbers in base 10, e.g. U+0660, ARABIC-INDIC DIGIT
    ZERO. Formally a decimal character is a character in the Unicode
    General Category "Nd".
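
As a rough illustration of "can be used to form numbers in base 10" (a sketch only; the sample string is my own choice), int() accepts a run of ARABIC-INDIC digits:

```python
import unicodedata

# ARABIC-INDIC DIGIT ONE, TWO, THREE (U+0661..U+0663).
arabic_indic = "\u0661\u0662\u0663"

# General Category of each character in the sample string.
categories = [unicodedata.category(ch) for ch in arabic_indic]

# int() parses decimal characters from any script as base 10.
value = int(arabic_indic)

print(categories, arabic_indic.isdecimal(), value)
```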

And here is another proposal for isdigit, probably friendlier too:

    Return true if all characters in the string are digits and there is at
    least one character, false otherwise.  Digits include decimal characters
    and digits that need special handling, such as the compatibility
    superscript digits.  This covers digits which cannot be used to form
    numbers in base 10, like the Kharosthi numbers.  Formally, a digit is a
    character that has the property value Numeric_Type=Digit or
    Numeric_Type=Decimal.

----------

_______________________________________
Python tracker <rep...@bugs.python.org>
<http://bugs.python.org/issue26483>
_______________________________________