Oh, my bad. I thought I was calling MSVCRT's `btowc`, not the replacement.

I tried calling the real `btowc`, and its behavior matches UCRT. I also tried 
loading msvcr100.dll; it behaves the same way.
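
For reference, a minimal sketch of how the real `btowc` can be looked up at run 
time so that the statically linked replacement is bypassed. The names 
`btowc_fn` and `load_crt_btowc` are made up for illustration, and the 
non-Windows branch exists only so that the snippet compiles elsewhere.

```c
#include <stddef.h>
#include <wchar.h>
#ifdef _WIN32
#include <windows.h>
#endif

/* Signature of btowc as declared in <wchar.h>. */
typedef wint_t (*btowc_fn)(int);

/* Hypothetical helper: return the btowc exported by the named CRT DLL
 * (for example "msvcrt.dll" or "msvcr100.dll"), or NULL if the DLL or
 * the export cannot be found. Always returns NULL on non-Windows hosts. */
static btowc_fn load_crt_btowc(const char *dll_name)
{
#ifdef _WIN32
    HMODULE crt = LoadLibraryA(dll_name);
    if (crt == NULL)
        return NULL;
    /* Casting through void (*)(void) avoids a function-cast warning. */
    return (btowc_fn)(void (*)(void))GetProcAddress(crt, "btowc");
#else
    (void)dll_name;
    return NULL;
#endif
}
```

After a NULL check, calling the returned pointer exercises the DLL's own 
function rather than the one built from 'mingw-w64-crt/misc/btowc.c'.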

So, I think we could check the return value of `___lc_codepage_func()` and, for 
the "C" locale, return values in the range [0,255] as is and return WEOF if the 
value is outside this range.
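
Roughly what I have in mind for the "C" locale branch. This is only a sketch, 
not a patch: `btowc_c_locale_sketch` is a made-up name, and I am assuming that 
the caller has already decided it is in the "C" locale (presumably by 
`___lc_codepage_func()` returning 0, which is an assumption on my part).

```c
#include <wchar.h>

/* Hypothetical helper: the value btowc(c) would return in the "C" locale
 * under the proposal above. The caller is assumed to have checked the
 * locale already; no code page lookup happens here. */
static wint_t btowc_c_locale_sketch(int c)
{
    /* Anything outside [0,255], including EOF, is rejected. */
    if (c < 0 || c > 255)
        return WEOF;

    /* Values in [0,255] are returned as is (zero-extended). */
    return (wint_t)c;
}
```

Other locales would still go through the existing conversion path.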

Should we change the behavior so that the replacement is used only when there 
is no real `btowc`? I do not fully understand how this mechanism works.

Btw, glibc's `btowc` returns WEOF for characters in the range [128,255] with 
the "C" locale. The behavior of `btowc` not returning WEOF for values in the 
range [128,255] seems to be new. Existing manpages do not mention it.
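
A small probe for comparing C libraries on this point, using only standard 
functions. The helper names are made up, and the result depends on the C 
library and its version, so no expected output is given.

```c
#include <locale.h>
#include <wchar.h>

/* Hypothetical helper: switch to the "C" locale and return btowc(c). */
static wint_t btowc_in_c_locale(int c)
{
    setlocale(LC_ALL, "C");
    return btowc(c);
}

/* Count how many bytes in [128,255] btowc rejects with WEOF. */
static int count_weof_high_bytes(void)
{
    int count = 0;
    for (int c = 128; c <= 255; c++) {
        if (btowc_in_c_locale(c) == WEOF)
            count++;
    }
    return count;
}
```

A count of 128 means every high byte is rejected; a count of 0 means all of 
them are converted.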

- Kirill Makurin


________________________________
From: LIU Hao
Sent: Tuesday, June 3, 2025 7:35 PM
To: Kirill Makurin; mingw-w64-public@lists.sourceforge.net
Subject: Re: [Mingw-w64-public] Inconsistent behavior of btowc with "C" locale

On 2025-6-3 18:02, Kirill Makurin wrote:
> Ah, so UCRT does the right thing? I was not aware of this behavior for 
> "POSIX" locale.
>
> I still wonder whether MSVCRT's btowc succeeds when GetACP() != CP_UTF8. If 
> it converts using the code page returned by GetACP(), this means that the 
> replacement has the same behavior.

No. I meant the UCRT function should be standard. The POSIX locale seems to 
define that characters within [128,255] are zero-extended:

https://pubs.opengroup.org/onlinepubs/9799919799/basedefs/V1_chap07.html#tag_07_03


The issue here is that we replace MSVCRT `btowc()` with a custom implementation 
in 'mingw-w64-crt/misc/btowc.c'. I think you may try calling the genuine MSVCRT 
one, which is returned by `GetProcAddress()`.


--
Best regards,
LIU Hao

_______________________________________________
Mingw-w64-public mailing list
Mingw-w64-public@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mingw-w64-public
