I don't think that any CRT header file would include windows.h, so I think it 
shouldn't be an issue.

I agree that 4th patch may be a little bit of overkill and results in 
incompatible usage of mbstate_t. This patch can be safely skipped if anyone 
thinks it's a bad idea.

Returning mbrtowc_cp is not hard. I think we may need to use IsDBCSLeadByte 
instead of isleadbyte once again. If we would call isleadbyte from mbrtowc_cp, 
there exists a little chance that some other thread may change global locale 
after we called ___lc_codepage_func or ___mb_cur_max_func but before 
isleadbyte, leading to incorrect results.
________________________________
From: Pali Rohár <[email protected]>
Sent: Wednesday, September 24, 2025 10:10 PM
To: Kirill Makurin <[email protected]>
Cc: mingw-w64-public <[email protected]>
Subject: Re: Correctly handle NULL arguments to mbrtowc

Hello, I have few comments:

PATCH 3:
I would prefer to put '#define WIN32_LEAN_AND_MEAN' at beginning of the
file before any other #include. This should prevent issues in future if
some other header file is included before that define and would
transitively include also windows.h.

PATCH 4:
About storing "unsigned short cp" into mbstate_t. This seems to be
incompatible change with msvcrt / UCRT. And I'm not sure if this is a
good idea. For example it can break mixing of mingw-w64 mbrtowc function
with any other msvcrt / UCRT function which is using mbstate_t.

mbrtowc_cp:
My original idea was to simplify code, remove unnecessary parts and do
not statically link into every executable unused code.
But I understand that for crtdll.dll the ___lc_codepage_func
implementation can be slow. I did not think about it originally.
If this is really a problem then we can revert it back, but ideally as
internal symbol prefixed e.g. by mingw term.

On Wednesday 24 September 2025 12:24:43 Kirill Makurin wrote:
> Current implementation of mbrtowc incorrectly handles situations when its 
> first or second argument is NULL. See POSIX specification[1].
>
> When first argument is NULL it must behave as usual, except that it produces 
> no output. Currently, it does not update conversion state in this case.
>
> When second argument is NULL, it must act as if was called as mbrtowc (NULL, 
> "", 1, state). Currently it simply puts `state` to initial conversion state.
>
> Fourth patch in this series makes detection of invalid conversion state 
> (optional POSIX feature) more robust. This may be a little overkill, but I 
> see no harm in doing so.
>
> Pali, remember there previously was mbrtowc_cp? I think we could return it. 
> crtdll.dll's ___lc_codepage_func seems to do the trick with strchr (setlocale 
> (LC_CTYPE, NULL), '.') and atoi to convert code page to int. I find it 
> unnecessary expensive to do on each call to mbrtowc from mbsrtowcs.
>
> - Kirill Makurin
>
> [1] https://pubs.opengroup.org/onlinepubs/9799919799/functions/mbrtowc.htm

_______________________________________________
Mingw-w64-public mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/mingw-w64-public

Reply via email to