On 2025-6-1 17:23, Kirill Makurin wrote:
Hi, I have noticed that the behavior of the `btowc` function is inconsistent between MSVCRT and UCRT when the current locale is "C". UCRT's `btowc` converts bytes in the range 128-255 as if the source charset were ISO-8859-1 (code page 28591). MSVCRT's `btowc` (on Windows 11) fails and returns `WEOF`.
In the "C" and "POSIX" locales (they denote the same locale), this should be the standard behavior (https://pubs.opengroup.org/onlinepubs/9799919799/functions/btowc.html):
[CX] In the POSIX locale, btowc() shall not return WEOF if c has a value in the range 0 to 255 inclusive.
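For illustration, the requirement quoted above could be checked with something like the following. This is only a minimal sketch in standard C, not the attached program; the helper names `count_btowc_failures` and `report_btowc` are made up for this example, and the actual counts depend on the C runtime being tested.

```c
#include <locale.h>
#include <stdio.h>
#include <wchar.h>

/* Hypothetical helper: count the byte values in [first, last] for which
 * btowc() returns WEOF in the currently selected locale. */
int count_btowc_failures(int first, int last)
{
    int failures = 0;
    for (int c = first; c <= last; ++c) {
        if (btowc(c) == WEOF)
            ++failures;
    }
    return failures;
}

/* Hypothetical helper: select the given locale and report how many bytes
 * in 128..255 btowc() rejects. Returns -1 if the locale cannot be set. */
int report_btowc(const char *locale_name)
{
    if (setlocale(LC_ALL, locale_name) == NULL)
        return -1;
    int high = count_btowc_failures(128, 255);
    printf("locale \"%s\": %d of 128 high bytes give WEOF\n", locale_name, high);
    return high;
}
```

Calling `report_btowc("C")` on a runtime that follows the quoted [CX] requirement would be expected to report zero failures.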
I have attached a simple program that can be used to reproduce it. It takes one argument, which is a locale string (you can use ".CodePage" for simplicity). I must mention that I have UTF-8 enabled globally, which means GetACP() == CP_UTF8. If anyone can test on a system where this is not the case, that would be helpful. This made me wonder how mingw-w64 implements the replacement for the msvcr*.dll versions which do not have it: it passes the return value of `___lc_codepage_func()` directly to `MultiByteToWideChar`.
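The fallback path described above, with one possible place for a "C" locale special case, might be modelled roughly as below. This is a portable sketch, not the mingw-w64 source: `btowc_sketch` and `convert_fn` are hypothetical names, the callback stands in for the `MultiByteToWideChar` call, and the sketch assumes that a code page value of 0 is how the "C" locale shows up in the value returned by `___lc_codepage_func()`.

```c
#include <stdio.h>   /* EOF */
#include <wchar.h>   /* wchar_t, wint_t, WEOF */

/* Hypothetical stand-in for the MultiByteToWideChar call: converts one byte
 * in the given code page and returns the number of wide characters written. */
typedef int (*convert_fn)(unsigned int codepage, unsigned char byte, wchar_t *out);

/* Sketch of a btowc replacement that special-cases the "C" locale.
 * Assumption: codepage == 0 denotes the "C" locale. */
wint_t btowc_sketch(int c, unsigned int codepage, convert_fn convert)
{
    wchar_t wc;

    if (c == EOF)
        return WEOF;

    /* "C" locale: map bytes 0..255 to the same code points, which matches
     * the ISO-8859-1-like result reported for UCRT. */
    if (codepage == 0)
        return (wint_t)(unsigned char)c;

    /* Any other locale: defer to the code page conversion. */
    if (convert == NULL || convert(codepage, (unsigned char)c, &wc) != 1)
        return WEOF;

    return (wint_t)wc;
}
```

With this shape, a byte such as 0xE9 yields 0xE9 when the code page is 0, and the conversion callback is only consulted for other code pages.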
Maybe it's necessary to check for "C" locale in there.

--
Best regards,
LIU Hao
_______________________________________________
Mingw-w64-public mailing list
Mingw-w64-public@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mingw-w64-public