On 2025-6-26 16:15, Kirill Makurin wrote:
Hello,

I was investigating the wc*tomb* and mb*towc* functions in the CRT and comparing their behavior to other implementations. Take the following example:

```
mbrtowc (NULL, s, 1, ps)
mbrtowc (NULL, s + 1, 1, ps)
```

Here, `s` is a pointer to a multibyte (DBCS) character, but since n==1, mbrtowc returns (size_t)-2 and updates ps. The next call completes the conversion of the multibyte character. What's the return value? The CRT returns 2 while glibc returns 1. It seems to me that ISO C and POSIX specify different behavior for this case.
Please note this line in POSIX-2024 [1]:

    [CX] [Option Start] The functionality described on this reference page is aligned with the ISO C standard. Any conflict between the requirements described here and the ISO C standard is unintentional. This volume of POSIX.1-2024 defers to the ISO C standard. [Option End]

The only difference between ISO C and POSIX in the specification of the return value is that ISO C says 'multibyte character' while POSIX says 'character':
    between 1 and n inclusive if the next n or fewer bytes complete a valid multibyte character (which is the value stored); the value returned is the number of bytes that complete the multibyte character.

This reads to me as if 'the number of bytes' means the number of bytes within 'the next n or fewer bytes', i.e. the bytes consumed by this call, not the total length of the character. Under that reading, the function never returns a value that (after being cast to `ptrdiff_t`, so that the (size_t)-1 and (size_t)-2 returns become negative) is greater than n.
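To make that reading concrete, here is a minimal sketch that can be run against any implementation. The helper names `feed_bytewise` and `within_limit` are hypothetical, made up for this illustration; they are not part of the CRT, glibc, or mingw-w64. The sketch feeds a string to mbrtowc one byte at a time and checks each return value against the limit n. The caller is assumed to have selected a suitable multibyte locale with setlocale beforehand; locale names are platform-specific, so none is hard-coded.

```c
#include <assert.h>
#include <locale.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper: returns 1 if `ret` is consistent with the reading
   above, i.e. it is (size_t)-1, (size_t)-2, or no greater than n. */
static int within_limit(size_t ret, size_t n)
{
    if (ret == (size_t)-1 || ret == (size_t)-2)
        return 1;
    return ret <= n;
}

/* Hypothetical helper: calls mbrtowc with n == 1 for each of the `len`
   bytes of `s`, sharing one conversion state, and stores every return
   value in out[]. Stops early on an encoding error, because the state
   is unspecified afterwards. Returns the number of calls made. */
static size_t feed_bytewise(const char *s, size_t len, size_t *out)
{
    mbstate_t ps;
    size_t i;

    assert(s != NULL && out != NULL);
    memset(&ps, 0, sizeof ps);          /* initial conversion state */

    for (i = 0; i < len; i++) {
        out[i] = mbrtowc(NULL, s + i, 1, &ps);
        if (out[i] == (size_t)-1)
            return i + 1;
    }
    return len;
}
```

For a two-byte character, `out[0]` should be (size_t)-2 on any conforming implementation. `out[1]` is the value in question: per the report above, it is 2 with the CRT and 1 with glibc, so `within_limit(out[1], 1)` would hold only for the latter.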
[1] https://pubs.opengroup.org/onlinepubs/9799919799.2024edition/functions/mbrtowc.html

--
Best regards,
LIU Hao
_______________________________________________ Mingw-w64-public mailing list Mingw-w64-public@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/mingw-w64-public