What is a number having a numeric type of "digit" meant to convey?
The old Unicode 2.0 definition of "digit value" seemed clear:

   "Digit value. This is a numeric field. If the character represents a digit, not necessarily a decimal digit, the value is here. This covers digits which do not form decimal radix forms, such as the compatibility superscript digits. This field is informative."

That definition seems to be gone. From what we now have, I can think of several meanings, e.g.:

1) It's a digit in a system of decimal place notation, but doesn't quite qualify for some reason. Typical examples are:

   a) U+19DA NEW TAI LUE THAM DIGIT ONE - cruelly denied "decimal" status because it wasn't assigned with 9 clones of the other Tai Lue digits.

   b) U+2070 SUPERSCRIPT ZERO - not in a contiguous range, and there is apparently a possibility of misleading parsers.

   c) U+2080 SUBSCRIPT ZERO - apparent possibility of misleading parsers.

2) It's in a decimal system, but not with place notation:

   a) U+10E60 RUMI DIGIT ONE. By contrast, U+10E69 RUMI NUMBER TEN is a mere "numeric" - possibly because numeric field values of blank for the decimal digit value (field 6 in UnicodeData.txt), 1 as the digit value (field 7) and 10 as the value (field 8) would be too confusing, as well as contrary to the current rules.

   On the other hand, I don't see why, apart from a general disapproval of compatibility characters, the Roman numerals U+2170 SMALL ROMAN NUMERAL ONE to U+2178 SMALL ROMAN NUMERAL NINE don't count as digits.

3) It's derived from a decimal digit, e.g. U+2468 CIRCLED DIGIT NINE is "digit", whereas the next in the series, U+2469 CIRCLED NUMBER TEN, just has a numeric type of "numeric".

----

It's not clear to me why the following decimal digits (in the normal, not the Unicode sense) are not classified as "digit" but just as "numeric":

   U+1D360 COUNTING ROD UNIT DIGIT ONE
   U+3021 HANGZHOU NUMERAL ONE

The only reason I can think of for U+1D369 COUNTING ROD TENS DIGIT ONE not to be a digit is that the system is conceived of as a centesimal system.
The counting rod 'UNIT' and 'TENS' digits are used alternately to avoid misreading, with various methods for indicating zero.

Likewise, why are U+0C79 TELUGU FRACTION DIGIT ONE FOR ODD POWERS OF FOUR and related characters not digits? Is it because they are a base 4 (or collectively hexadecimal) system?

Perhaps some light can be shed on the system by learning what people actually use the numeric types and (decimal) digit values for.

Richard.
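
For anyone who wants to check the classifications mentioned above, the three UnicodeData.txt fields (6, 7 and 8) can be inspected with Python's standard unicodedata module. What follows is only a minimal sketch: the helper names numeric_fields and numeric_type and the list of sample characters are illustrative assumptions, not anything defined by Unicode, and the results depend on the Unicode version bundled with the Python build in use.

```python
import unicodedata

# Code points taken from the examples discussed in this message.
SAMPLES = [
    0x0031, 0x19DA, 0x2070, 0x2080, 0x10E60, 0x10E69, 0x2170,
    0x2468, 0x2469, 0x1D360, 0x1D369, 0x3021, 0x0C79,
]


def numeric_fields(ch):
    """Return (decimal, digit, numeric) for ch, i.e. fields 6, 7 and 8
    of UnicodeData.txt, with None wherever the field is blank."""
    return (
        unicodedata.decimal(ch, None),
        unicodedata.digit(ch, None),
        unicodedata.numeric(ch, None),
    )


def numeric_type(ch):
    """Hypothetical helper: infer the numeric type from the first
    non-blank field, checking decimal, then digit, then numeric."""
    decimal, digit, numeric = numeric_fields(ch)
    # Compare against None explicitly: a value of 0 is a valid digit.
    if decimal is not None:
        return "Decimal"
    if digit is not None:
        return "Digit"
    if numeric is not None:
        return "Numeric"
    return "None"


if __name__ == "__main__":
    for cp in SAMPLES:
        ch = chr(cp)
        name = unicodedata.name(ch, "<unnamed>")
        print(f"U+{cp:04X} {numeric_type(ch):8} {numeric_fields(ch)} {name}")
```

Running it prints one line per sample character with the inferred numeric type and the three raw values, which makes it easy to compare, say, U+2468 against U+2469 side by side.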

