> On 17 May 2018, at 11:02, Joerg Schilling 
> <joerg.schill...@fokus.fraunhofer.de> wrote:
> 
> Hans Åberg <haber...@telia.com> wrote:
> 
>>>> |I asked a person who speaks japanese and he told me that
>>>> |
>>>> | "\u4e00\u4e8c\u4e09"
>>>> |
>>>> |is similar to
>>>> |
>>>> | "one two three"
>>>> |
>>>> |and this is not used for computing.
>>>> 
>>>> If i recall correctly this has been discussed already; if not here
>>>> then on the Unicode list.  Unicode brings quite a lot of
>>>> codepoints, like CIRCLED DIGIT ONE, PARENTHESIZED DIGIT ONE, DIGIT
>>>> ONE FULL STOP etc.  All these are marked "No", and i think the
>>>> discussion concluded that they should not be taken into account
>>>> when converting strings to numbers.
>> 
>> The intent may be that the value of the digit character c can be computed by 
>> the expression c - '0' when >= 0 and <= 9, and is otherwise a non-digit. 
>> Then 'isdigit' and [[:digit:]] are tied to that, so it is impossible to use 
>> any other decimal digits.
> 
> This seems to be an important idea, as this japanese one two three
> is not in a contiguous order.

The contiguous layout allows an efficient implementation, which was important 
on earlier computers. The "History" section of the UTF-8 article [1] mentions 
that, around 1992, they struggled to find proposals that allowed efficient 
implementations.
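
To make the c - '0' point concrete, here is a minimal sketch in C. The helper 
names digit_value and parse_decimal are made up for illustration and are not 
taken from any standard or library. It relies only on the C guarantee that the 
characters '0' through '9' have contiguous, increasing values.

```c
/* Hypothetical helper: value of a decimal digit character, or -1 if c is
   not one of '0'..'9'. Works because C guarantees '0'..'9' are contiguous. */
static int digit_value(int c)
{
    int v = c - '0';
    return (v >= 0 && v <= 9) ? v : -1;
}

/* Hypothetical helper: convert the leading run of decimal digits in s to a
   number, stopping at the first non-digit. No overflow checking is done. */
static long parse_decimal(const char *s)
{
    long n = 0;
    int v;

    while ((v = digit_value((unsigned char)*s)) >= 0) {
        n = n * 10 + v;
        s++;
    }
    return n;
}
```

With this, parse_decimal("123abc") stops at 'a' and yields 123. The same 
subtraction cannot work for U+4E00, U+4E8C, U+4E09, since those code points 
are not contiguous, so they would need a lookup table instead.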

1. https://en.wikipedia.org/wiki/UTF-8


