> On 16 May 2018, at 18:13, Hans Åberg <haber...@telia.com> wrote:
>
>
>> On 16 May 2018, at 17:14, Steffen Nurpmeso <stef...@sdaoden.eu> wrote:
>>
>> Joerg Schilling <joerg.schill...@fokus.fraunhofer.de> wrote:
>> |Steffen Nurpmeso <stef...@sdaoden.eu> wrote:
>> |>|> have some Unicode support.
>> |>|
>> |>|What do you expect:
>> |>|
>> |>| strtol("\u4e00\u4e8c\u4e09", &endp, 0);
>> |>
>> |> The entire is*() family cannot work with multibyte or stateful
>> |> encodings, right.
>> |
>> |I asked a person who speaks japanese and he told me that
>> |
>> | "\u4e00\u4e8c\u4e09"
>> |
>> |is similar to
>> |
>> | "one two three"
>> |
>> |and this is not used for computing.
>>
>> If i recall correctly this has been discussed already; if not here
>> then on the Unicode list. Unicode brings quite a lot of
>> codepoints, like CIRCLED DIGIT ONE, PARENTHESIZED DIGIT ONE, DIGIT
>> ONE FULL STOP etc. All these are marked "No", and i think the
>> discussion concluded that they should not be taken into account
>> when converting strings to numbers.

The intent may be that the value of a digit character c can be computed by the expression c - '0': if the result is >= 0 and <= 9, c is a digit with that value; otherwise c is a non-digit. Then 'isdigit' and [[:digit:]] are tied to that, so it is impossible to use any other decimal digits.
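
A minimal sketch of that reading, in C. The helper names digit_value and parse_decimal are made up for illustration and are not part of any standard; the only thing relied on is the C guarantee that '0' through '9' have contiguous, increasing values. Overflow handling is deliberately left out.

```c
/* Hypothetical helper: value 0..9 of a digit character, or -1 for a
 * non-digit. Relies on '0'..'9' being contiguous, which C guarantees. */
static int digit_value(int c)
{
    int v = c - '0';
    return (v >= 0 && v <= 9) ? v : -1;
}

/* Hypothetical helper: converts the leading run of decimal digits in s,
 * stopping at the first character that digit_value() rejects.
 * No sign, base, or overflow handling; illustration only. */
static long parse_decimal(const char *s)
{
    long n = 0;
    int v;

    /* Cast to unsigned char so bytes >= 0x80 do not become negative. */
    while ((v = digit_value((unsigned char)*s)) >= 0) {
        n = n * 10 + v;
        s++;
    }
    return n;
}
```

Under this scheme every byte of a multibyte sequence such as the UTF-8 encoding of U+4E00 falls outside 0..9 after the subtraction, so the conversion stops before consuming anything, which matches the point that no other decimal digits can be used.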