On Thu, May 17, 2018 at 11:02:48AM +0200, Joerg Schilling wrote:
> Hans Åberg <haber...@telia.com> wrote:
>
> > >> |I asked a person who speaks japanese and he told me that
> > >> |
> > >> |    "\u4e00\u4e8c\u4e09"
> > >> |
> > >> |is similar to
> > >> |
> > >> |    "one two three"
> > >> |
> > >> |and this is not used for computing.
> > >>
> > >> If i recall correctly this has been discussed already; if not here
> > >> then on the Unicode list. Unicode brings quite a lot of
> > >> codepoints, like CIRCLED DIGIT ONE, PARENTHESIZED DIGIT ONE, DIGIT
> > >> ONE FULL STOP etc. All these are marked "No", and i think the
> > >> discussion concluded that they should not be taken into account
> > >> when converting strings to numbers.
> >
> > The intent may be that the value of the digit character c can be computed
> > by the expression c - '0' when >= 0 and <= 9, and is otherwise a non-digit.
> > Then 'isdigit' and [[:digit:]] are tied to that, so it is impossible to use
> > any other decimal digits.
>
> This seems to be an important idea, as this japanese one two three
> is not in a contiguous order.
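To make the quoted point concrete, here is a minimal sketch in C of the c - '0' rule, plus a check showing that the three code points from the quoted string (U+4E00, U+4E8C, U+4E09) are indeed not consecutive. The function names are hypothetical, chosen only for illustration; they come from no standard or library.

```c
#include <stddef.h>

/* Value of an ASCII decimal digit, or -1 for a non-digit.
   The C standard guarantees that '0'..'9' are consecutive,
   so c - '0' is portable. */
int ascii_digit_value(int c)
{
    return (c >= '0' && c <= '9') ? c - '0' : -1;
}

/* Hypothetical helper: returns 1 if cp[0..n-1] are consecutive
   code points (cp[i] == cp[0] + i for every i), else 0. */
int is_consecutive_run(const unsigned long *cp, size_t n)
{
    size_t i;

    for (i = 1; i < n; i++)
        if (cp[i] != cp[0] + i)
            return 0;
    return 1;
}
```

Called on { 0x4E00, 0x4E8C, 0x4E09 }, is_consecutive_run() returns 0, so subtracting a fixed base cannot give the values one, two, three for these characters.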
Well, the digits in other scripts are ordered consecutively, so the
calculation could easily be done for the scripts I previously documented,
as prescribed in ISO 14652. This is not rocket science.

Best regards
keld
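P.S. A minimal sketch in C of that calculation, assuming each script's digits occupy ten consecutive code points starting at its digit zero. The table is a small illustrative selection of Unicode digit-zero code points, not the list from ISO 14652 or from my earlier documentation, and the names zero_bases and digit_value are hypothetical.

```c
#include <stddef.h>
#include <stdint.h>

/* Code point of DIGIT ZERO for a few scripts (illustrative selection,
   an assumption for this sketch, not taken from ISO 14652). */
static const uint32_t zero_bases[] = {
    0x0030, /* DIGIT ZERO (ASCII) */
    0x0660, /* ARABIC-INDIC DIGIT ZERO */
    0x0966, /* DEVANAGARI DIGIT ZERO */
    0x0E50  /* THAI DIGIT ZERO */
};

/* Returns 0..9 if cp lies in one of the ten-code-point digit runs
   listed above, otherwise -1. Same idea as c - '0', with the base
   chosen per script. */
int digit_value(uint32_t cp)
{
    size_t i;

    for (i = 0; i < sizeof zero_bases / sizeof zero_bases[0]; i++) {
        if (cp >= zero_bases[i] && cp <= zero_bases[i] + 9)
            return (int)(cp - zero_bases[i]);
    }
    return -1;
}
```

With this table, digit_value(0x0663) gives 3 (ARABIC-INDIC DIGIT THREE), while CIRCLED DIGIT ONE (U+2460) and the ideograph U+4E8C both give -1, since neither falls in a listed run.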