On 2025-6-1 17:23, Kirill Makurin wrote:
Hi,

I have noticed that the behavior of the `btowc` function is inconsistent between
MSVCRT and UCRT when the current locale is "C".

UCRT's `btowc` converts bytes in the range 128-255 as if the source charset were
ISO-8859-1 (code page 28591). MSVCRT's `btowc` (on Windows 11) fails and returns `WEOF`.

In the "C" and "POSIX" locales (both names denote the same locale), not returning `WEOF` should be the standard behavior (https://pubs.opengroup.org/onlinepubs/9799919799/functions/btowc.html):

   [CX] In the POSIX locale, btowc() shall not return WEOF if c has a value in
   the range 0 to 255 inclusive.


I have attached a simple program which can be used to reproduce it. It takes one argument 
which is a locale string (you can use ".CodePage" for simplicity).
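
The attachment itself is not included in this message. A minimal sketch of such a check might look like the following. The helper name `count_btowc_failures` is hypothetical and is not taken from the attached program; it only uses the standard `setlocale` and `btowc` functions.

```c
#include <assert.h>
#include <locale.h>
#include <wchar.h>

/* Hypothetical helper (not the attached program): counts the bytes in
   [first, last] for which btowc() returns WEOF under the given locale.
   Returns -1 if the locale cannot be set. */
static int count_btowc_failures(const char *locale_name, int first, int last)
{
    assert(0 <= first && first <= last && last <= 255);

    if (setlocale(LC_ALL, locale_name) == NULL)
        return -1;

    int failures = 0;
    for (int c = first; c <= last; ++c) {
        if (btowc(c) == WEOF)
            ++failures;
    }
    return failures;
}
```

Calling it with the locale string from the command line and the range 128 to 255 should be enough to compare the CRTs: going by the description above, a UCRT build would be expected to report 0 failures for "C", and an MSVCRT build a non-zero count.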

I must mention that I have UTF-8 enabled globally, which means GetACP() == 
CP_UTF8. If anyone can test on a system where this is not the case, that would be 
helpful.

This made me wonder how mingw-w64 implements the replacement for the msvcr*.dll
versions which do not have it: it passes the return value of
`___lc_codepage_func()` directly to `MultiByteToWideChar`.

Maybe it's necessary to check for the "C" locale in there.
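
As a rough sketch only, such a check could look like the code below. It assumes that the code page value is 0 while the "C" locale is active; that is an assumption which would need to be verified against what `___lc_codepage_func()` actually returns. The names `btowc_sketch`, `codepage_convert_fn` and `always_fail` are hypothetical. The callback merely stands in for the existing `MultiByteToWideChar` path so that the sketch compiles without Windows headers; this is not the mingw-w64 code.

```c
#include <stdio.h>   /* EOF */
#include <wchar.h>   /* wint_t, WEOF */

/* Hypothetical stand-in for the existing MultiByteToWideChar-based path. */
typedef wint_t (*codepage_convert_fn)(unsigned char byte, unsigned int codepage);

/* Stub used only to exercise the sketch: pretends every conversion fails. */
static wint_t always_fail(unsigned char byte, unsigned int codepage)
{
    (void)byte;
    (void)codepage;
    return WEOF;
}

/* Assumption: codepage == 0 means the "C" locale is active. */
static wint_t btowc_sketch(int c, unsigned int codepage,
                           codepage_convert_fn convert)
{
    if (c == EOF)
        return WEOF;

    unsigned char byte = (unsigned char)c;

    /* "C" locale: map bytes 0-255 to the same code point, which matches
       the ISO-8859-1-like result described for UCRT above. */
    if (codepage == 0)
        return (wint_t)byte;

    /* Any other locale: defer to the code page conversion. */
    return convert(byte, codepage);
}
```

With this shape, `btowc_sketch(0xE9, 0, always_fail)` yields 0xE9 even though the stub conversion always fails, so the "C" locale path never reaches the code page conversion at all.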


--
Best regards,
LIU Hao


_______________________________________________
Mingw-w64-public mailing list
Mingw-w64-public@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mingw-w64-public
