Markus Kuhn <[EMAIL PROTECTED]>:
> [#4] The mbrtowc function returns the first of the following
> that applies (given the current conversion state):
>
> 0 if the next n or fewer bytes complete the
> multibyte character that corresponds to the
> null wide character (which is the value
> stored).
>
> positive if the next n or fewer bytes complete a
> valid multibyte character (which is the
> value stored); the value returned is the
> number of bytes that complete the multibyte
> character.
>
> (size_t)(-2) if the next n bytes contribute to an
> incomplete (but potentially valid) multibyte
> character, and all n bytes have been
> processed (no value is stored). [footnote 291]
>
> (size_t)(-1) if an encoding error occurs, in which case
> the next n or fewer bytes do not contribute
> to a complete and valid multibyte character
> (no value is stored); the value of the macro
> EILSEQ is stored in errno, and the
> conversion state is unspecified.
>
>
> "if the next n bytes contribute"
>
> The philosophical question is, whether 0 bytes "contribute".
I think so, yes. In any case, zero bytes definitely don't "complete" a
character, so you can't use the standard to justify returning 0. I don't
think you could call it an encoding error either, so there is really no
sensible alternative to returning (size_t)(-2).

Does Ulrich or anyone else have any objection to glibc's mbrtowc being
modified to return (size_t)(-2) instead of 0 when n is 0?

The only good reason to object that I can think of would be some
well-established de facto standard that carries more weight than the
official one. If so, fair enough, but please put a big warning in the
man page.
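
To make the n == 0 case concrete, here is a rough caller-side sketch in C.
The names mb_status, classify_mbrtowc and decode_one are hypothetical,
made up for this illustration; only mbrtowc itself comes from the
standard. The wrapper assumes the reading argued above (zero bytes cannot
complete a character) and so reports "incomplete" for n == 0 without
asking the library at all, which sidesteps whatever a given libc happens
to return in that case.

```c
#include <assert.h>
#include <stddef.h>
#include <wchar.h>

/* Hypothetical names, for illustration only. */
enum mb_status { MB_NUL, MB_CHAR, MB_INCOMPLETE, MB_ERROR };

/* Map an mbrtowc return value onto the four cases in the quoted text. */
enum mb_status classify_mbrtowc(size_t r)
{
    if (r == (size_t)-1)
        return MB_ERROR;        /* encoding error, errno set to EILSEQ */
    if (r == (size_t)-2)
        return MB_INCOMPLETE;   /* all n bytes consumed, nothing stored */
    if (r == 0)
        return MB_NUL;          /* null wide character was stored */
    return MB_CHAR;             /* r bytes completed a character */
}

/* Decode at most one character from the first n bytes of s.
 * Assumption: n == 0 is treated as "incomplete" here, so the result
 * does not depend on how the C library handles that case. */
enum mb_status decode_one(wchar_t *pwc, const char *s, size_t n,
                          mbstate_t *st)
{
    assert(st != NULL);
    if (n == 0)
        return MB_INCOMPLETE;
    return classify_mbrtowc(mbrtowc(pwc, s, n, st));
}
```

A caller that reads input in chunks would treat MB_INCOMPLETE as "fetch
more bytes and call again with the same mbstate_t", which is why 0 is an
awkward answer for an empty chunk: it would look like a terminating null.
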
Edmund
-
Linux-UTF8: i18n of Linux on all levels
Archive: http://mail.nl.linux.org/linux-utf8/