OK, so the previous code, which passed a plain char to _ismbblead(), was
wrong, and with this change it should be fine.
On Thursday 25 September 2025 04:12:44 Kirill Makurin wrote:
> IsDBCSLeadByte[Ex] functions take a BYTE argument which is unsigned char, so
> there should be no sign-extension issues. _ismbblead should be affected by
> sign-extension just like isleadbyte.
>
> ```
> #include <stdio.h>
>
> void as_byte (unsigned char c) {
>   printf ("%X\n", c);
> }
>
> void as_uint (unsigned c) {
>   printf ("%X\n", c);
> }
>
> int main (void) {
>   char c = (char) 0x80;
>   as_byte (c);
>   as_uint (c);
>   return 0;
> }
> ```
>
> When compiled and run, it prints:
>
> ```
> 80
> FFFFFF80
> ```
> ________________________________
> From: Pali Rohár <[email protected]>
> Sent: Thursday, September 25, 2025 5:49 AM
> To: [email protected]
> <[email protected]>
> Cc: Martin Storsjö <[email protected]>; LIU Hao <[email protected]>; Kirill
> Makurin <[email protected]>
> Subject: Re: [PATCH 1/2] crt: Replace _ismbblead() by IsDBCSLeadByte() in
> crtexewin.c
>
> Now I'm thinking about this change. Isn't there a sign-extension issue too?
> Does IsDBCSLeadByte take a signed char or an unsigned char as its argument?
>
> According to the MS docs:
> https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/ismbblead-ismbblead-l
> https://learn.microsoft.com/en-us/windows/win32/api/winnls/nf-winnls-isdbcsleadbyte
>
> _ismbblead takes unsigned int
> IsDBCSLeadByte takes BYTE
>
> I have a feeling that an explicit cast to (unsigned char) should have been
> used in both cases. But could it work without it?
>
> On Wednesday 24 September 2025 18:10:30 Pali Rohár wrote:
> > crtexewin.c in non-UNICODE mode parses the Windows command line string.
> > This string is encoded in the ACP. There is no CRT function that is
> > guaranteed to use the ACP when checking whether a character is a lead byte.
> >
> > The CRT function isleadbyte() uses the codepage from the CRT's current
> > locale, which may differ from the ACP.
> >
> > The CRT function _ismbblead() uses the codepage set by the CRT function
> > _setmbcp(), which may also differ from the ACP (but by default should be
> > the ACP).
> >
> > So when parsing Windows command line arguments, use the WinAPI function
> > IsDBCSLeadByte() which always works according to ACP and hence should be
> > the right one for this purpose.
> > ---
> > mingw-w64-crt/crt/crtexewin.c | 6 +-----
> > 1 file changed, 1 insertion(+), 5 deletions(-)
> >
> > diff --git a/mingw-w64-crt/crt/crtexewin.c b/mingw-w64-crt/crt/crtexewin.c
> > index af860f3e76da..52d12bd12029 100644
> > --- a/mingw-w64-crt/crt/crtexewin.c
> > +++ b/mingw-w64-crt/crt/crtexewin.c
> > @@ -7,10 +7,6 @@
> > #include <tchar.h>
> > #include <corecrt_startup.h>
> >
> > -#ifndef _UNICODE
> > -#include <mbctype.h>
> > -#endif
> > -
> > #define SPACECHAR _T(' ')
> > #define DQUOTECHAR _T('\"')
> >
> > @@ -40,7 +36,7 @@ int _tmain (int __UNUSED_PARAM(argc),
> > if (*lpCmdLine == DQUOTECHAR)
> > inDoubleQuote = !inDoubleQuote;
> > #ifndef _UNICODE
> > - if (_ismbblead (*lpCmdLine))
> > + if (IsDBCSLeadByte (*lpCmdLine))
> > {
> > if (lpCmdLine[1])
> > ++lpCmdLine;
> > --
> > 2.20.1
> >
_______________________________________________
Mingw-w64-public mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/mingw-w64-public