OK, so the previous _ismbblead() call was affected by sign extension, and
with this change to IsDBCSLeadByte() it should be fine.
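
For illustration, here is a minimal sketch of the difference. The helpers
lead_as_byte and lead_as_uint are hypothetical stand-ins for a function
taking BYTE and a function taking unsigned int; they are not the real
IsDBCSLeadByte or _ismbblead. The 0x81..0x9F lead-byte range is made up
for the example, and signed char is used explicitly so the result does not
depend on whether plain char is signed on a given target.

```c
#include <stdbool.h>

/* Hypothetical stand-in for a lead-byte test with a byte parameter,
   in the style of IsDBCSLeadByte (BYTE). The 0x81..0x9F range is an
   illustrative assumption, not a real code page table. */
static bool lead_as_byte (unsigned char c) {
  return c >= 0x81 && c <= 0x9F;
}

/* Hypothetical stand-in for a lead-byte test with an unsigned int
   parameter, in the style of _ismbblead (unsigned int). */
static bool lead_as_uint (unsigned int c) {
  return c >= 0x81 && c <= 0x9F;
}

/* Test one command-line byte in three ways and store the results. */
static void check_byte (signed char c, bool results[3]) {
  /* Converted to unsigned char at the call: 0x81 stays 0x81. */
  results[0] = lead_as_byte (c);
  /* Sign-extended first: a negative value lands far above 0x9F. */
  results[1] = lead_as_uint (c);
  /* Explicit cast removes the sign extension before widening. */
  results[2] = lead_as_uint ((unsigned char) c);
}
```

Under these assumptions, calling check_byte with (signed char) 0x81 should
give true, false, true: only the uncast call into the unsigned int variant
misses the lead byte.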

On Thursday 25 September 2025 04:12:44 Kirill Makurin wrote:
> The IsDBCSLeadByte[Ex] functions take a BYTE argument, which is unsigned
> char, so there should be no sign-extension issues. _ismbblead should be
> affected by sign extension just like isleadbyte.
> 
> ```
> #include <stdio.h>
> 
> void as_byte (unsigned char c) {
>   printf ("%X\n", c);
> }
> 
> void as_uint (unsigned c) {
>   printf ("%X\n", c);
> }
> 
> int main (void) {
>   char c = (char) 0x80;
>   as_byte (c);
>   as_uint (c);
>   return 0;
> }
> ```
> 
> When compiled and run, it prints:
> 
> ```
> 80
> FFFFFF80
> ```
> ________________________________
> From: Pali Rohár <[email protected]>
> Sent: Thursday, September 25, 2025 5:49 AM
> To: [email protected] 
> <[email protected]>
> Cc: Martin Storsjö <[email protected]>; LIU Hao <[email protected]>; Kirill 
> Makurin <[email protected]>
> Subject: Re: [PATCH 1/2] crt: Replace _ismbblead() by IsDBCSLeadByte() in 
> crtexewin.c
> 
> Now I'm thinking about this change. Isn't there a sign-extension issue too?
> Does IsDBCSLeadByte take a signed char or an unsigned char as its argument?
> 
> According to the MS docs:
> https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/ismbblead-ismbblead-l
> https://learn.microsoft.com/en-us/windows/win32/api/winnls/nf-winnls-isdbcsleadbyte
> 
> _ismbblead takes unsigned int
> IsDBCSLeadByte takes BYTE
> 
> I have a feeling that an explicit cast to (unsigned char) should have been
> used in both cases. But could it work without it?
> 
> On Wednesday 24 September 2025 18:10:30 Pali Rohár wrote:
> > crtexewin.c in non-UNICODE mode parses the Windows command line string.
> > This string is encoded in the ACP. There is no CRT function which is
> > guaranteed to use the ACP when checking whether a character is a lead byte.
> >
> > The CRT function isleadbyte() uses the codepage from the CRT's current
> > locale, which may differ from the ACP.
> >
> > The CRT function _ismbblead() uses the codepage set by the CRT function
> > _setmbcp(), which may also differ from the ACP (but by default should be
> > the ACP).
> >
> > So when parsing Windows command line arguments, use the WinAPI function
> > IsDBCSLeadByte() which always works according to ACP and hence should be
> > the right one for this purpose.
> > ---
> >  mingw-w64-crt/crt/crtexewin.c | 6 +-----
> >  1 file changed, 1 insertion(+), 5 deletions(-)
> >
> > diff --git a/mingw-w64-crt/crt/crtexewin.c b/mingw-w64-crt/crt/crtexewin.c
> > index af860f3e76da..52d12bd12029 100644
> > --- a/mingw-w64-crt/crt/crtexewin.c
> > +++ b/mingw-w64-crt/crt/crtexewin.c
> > @@ -7,10 +7,6 @@
> >  #include <tchar.h>
> >  #include <corecrt_startup.h>
> >
> > -#ifndef _UNICODE
> > -#include <mbctype.h>
> > -#endif
> > -
> >  #define SPACECHAR _T(' ')
> >  #define DQUOTECHAR _T('\"')
> >
> > @@ -40,7 +36,7 @@ int _tmain (int      __UNUSED_PARAM(argc),
> >            if (*lpCmdLine == DQUOTECHAR)
> >              inDoubleQuote = !inDoubleQuote;
> >  #ifndef _UNICODE
> > -          if (_ismbblead (*lpCmdLine))
> > +          if (IsDBCSLeadByte (*lpCmdLine))
> >              {
> >                if (lpCmdLine[1])
> >                  ++lpCmdLine;
> > --
> > 2.20.1
> >


_______________________________________________
Mingw-w64-public mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/mingw-w64-public