Oh, my bad. I thought I was calling MSVCRT's `btowc`, not the replacement.
I tried calling the real `btowc` and its behavior matches UCRT. I also tried loading msvcr100.dll; it has the same behavior.

So, I think we could check the return value of `___lc_codepage_func()` and, for the "C" locale, return values in range [0,255] as is and return WEOF if the value is outside this range.

Should we change the behavior so that the replacement is used only when there is no real `btowc`? I do not fully understand how this mechanism works.

Btw, glibc's `btowc` returns WEOF for characters in range [128,255] with the "C" locale. The behavior of `btowc` not returning WEOF for values in range [128,255] seems to be new. Existing manpages do not mention it.

- Kirill Makurin

________________________________
From: LIU Hao
Sent: Tuesday, June 3, 2025 7:35 PM
To: Kirill Makurin; mingw-w64-public@lists.sourceforge.net
Subject: Re: [Mingw-w64-public] Inconsistent behavior of btowc with "C" locale

On 2025-6-3 18:02, Kirill Makurin wrote:
> Ah, so UCRT does the right thing? I was not aware of this behavior for
> "POSIX" locale.
>
> I still wonder whether MSVCRT's btowc succeeds when GetACP() != CP_UTF8. If
> it converts using the code page returned by GetACP(), this means that the
> replacement has the same behavior.

No. I meant the UCRT function should be standard. The POSIX locale seems to define that characters within [128,255] are zero-extended:

https://pubs.opengroup.org/onlinepubs/9799919799/basedefs/V1_chap07.html#tag_07_03

The issue here is that we replace MSVCRT `btowc()` with a custom implementation in 'mingw-w64-crt/misc/btowc.c'. I think you may try calling the genuine MSVCRT one, which is returned by `GetProcAddress()`.

--
Best regards,
LIU Hao

_______________________________________________
Mingw-w64-public mailing list
Mingw-w64-public@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mingw-w64-public
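
For illustration, the "C" locale check proposed in this message could look roughly like the sketch below. It is not the code in 'mingw-w64-crt/misc/btowc.c'. The name `sketch_btowc` is hypothetical, the `codepage` parameter stands in for the value of `___lc_codepage_func()`, and the sketch assumes that a value of 0 identifies the "C" locale. Conversion for other code pages is left out, so the sketch simply returns WEOF there.

```c
#include <assert.h>
#include <stdio.h>   /* EOF */
#include <wchar.h>   /* wint_t, WEOF */

/* Hypothetical helper, not the mingw-w64 implementation.
   `codepage` stands in for the result of ___lc_codepage_func();
   ASSUMPTION: 0 means the "C" locale is active. */
wint_t sketch_btowc(int c, unsigned int codepage)
{
    if (c == EOF)
        return WEOF;

    if (codepage == 0) {
        /* "C" locale: pass single-byte values through unchanged
           (zero-extended), reject anything outside [0,255]. */
        if (c >= 0 && c <= 255)
            return (wint_t)c;
        return WEOF;
    }

    /* Any other code page needs a real multibyte-to-wide conversion,
       which this sketch does not attempt. */
    return WEOF;
}

/* Small self-check of the "C" locale branch. */
void sketch_btowc_self_check(void)
{
    assert(sketch_btowc('A', 0) == (wint_t)'A');
    assert(sketch_btowc(0xFF, 0) == (wint_t)0xFF);
    assert(sketch_btowc(256, 0) == WEOF);
    assert(sketch_btowc(EOF, 0) == WEOF);
}
```

With this shape, bytes in [128,255] come back zero-extended in the "C" locale, which is the behavior observed from UCRT and the real MSVCRT `btowc`, while glibc would return WEOF for the same inputs.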