Yes, MSVCRT doesn't recognize UTF-8. After debugging, I figured out that
setlocale(LC_CTYPE, ".UTF8"); is equivalent to setlocale(LC_CTYPE, 0);
in MSVCRT, and setlocale(LC_CTYPE, 0); also makes it work in UCRT. So,
instead of setlocale(LC_CTYPE, ".UTF8");, setlocale(LC_CTYPE, 0); is a
temporary solution in both cases. I don't know why, but it works anyway.
I have tested it with GBK, UTF-8 and ISO8859-6.
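
For reference, a minimal sketch of what the null argument does in
standard C: when the locale argument is a null pointer (which is what
the literal 0 becomes here), setlocale only returns the name of the
current locale and changes nothing. The helper name copy_ctype_locale
is hypothetical and used only for illustration. Whether this fully
explains the behaviour observed in MSVCRT and UCRT has not been
verified here.

```c
#include <locale.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: copies the name of the current LC_CTYPE locale
 * into buf. Returns 0 on success, -1 if the name is unavailable or
 * does not fit. */
static int copy_ctype_locale(char *buf, size_t size)
{
    /* A null locale argument is a pure query; the locale is not changed. */
    const char *name = setlocale(LC_CTYPE, NULL);

    if (name == NULL || strlen(name) >= size)
        return -1;
    strcpy(buf, name);
    return 0;
}
```

Calling copy_ctype_locale before and after setlocale(LC_CTYPE, 0) should
yield the same name, and a program that never selected another locale
starts in the "C" locale.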

On Mon, Mar 20, 2023 at 21:22, Alvin Wong <[email protected]> wrote:

> The thing is, the code point sequence you have here is not valid UTF-8 at
> all. If it were indeed converting from UTF-8, you would most likely get
> incorrect results or crashes.
>
> As you realized and reported in another reply, you were actually testing
> with msvcrt. It is likely that msvcrt just ignored the unsupported locale
> and did something unspecified.
> On 20/3/2023 19:07, 傅继晗 wrote:
>
> However, I use GBK as the default code page on my Windows system, and I
> tried testing with GBK-encoded content, but this trick still seems to
> work. Here is the test case.
>
> ----------------------------------------------------------------------------------------------------
> #include <stdio.h>
> extern char * __cdecl basename (char *path);
> void xprint(const char *s)
> {
>     while (*s)
>         printf("\\x%02x", (int)(unsigned char)(*s++));
> }
>
> int main(int argc, char **argv)
> {
>     // GBK encoding of "/sdcard/天天向上"
>     char input[] = {0x2f,0x73,0x64,0x63,0x61,0x72,0x64,0x2f,
>                     0xcc,0xec,0xcc,0xec,0xcf,0xf2,0xc9,0xcf,0x00};
>     char *output;
>     printf("basename(\"");
>     xprint(input);
>     printf("\") = \"");
>     output = basename(input);
>     xprint(output);
>     printf("\"\n");
>     return 0;
> }
>
> ----------------------------------------------------------------------------------------------------
>
>
> On Mon, Mar 20, 2023 at 18:52, Alvin Wong <[email protected]> wrote:
>
>> Hi,
>>
>> Thanks for sending the patches. However, my comment on these patches is
>> that they only work when the process ANSI codepage (ACP) is UTF-8, which
>> requires either embedding a manifest with activeCodePage set to UTF-8 or
>> setting the system ACP to UTF-8. If the process is using CP936 (GBK), for
>> example, it will still be broken in much the same way as before.
>>
>> Just my two cents: I would prefer to remove any code that changes the
>> locale and then attempts to restore it (which is not thread-safe), then
>> replace `mbstowcs` and `wcstombs` with direct usage of
>> `MultiByteToWideChar` and `WideCharToMultiByte`, which can convert
>> from/to CP_ACP directly.
>>
>> Best Regards,
>> Alvin
>> On 20/3/2023 18:36, 傅继晗 wrote:
>>
>> OK, it has a .txt extension now.
>>
>> On Mon, Mar 20, 2023 at 18:10, Alvin Wong <[email protected]> wrote:
>>
>>> Hi, if you attached a patch in your mail, it has been stripped by the
>>> mailing list software. Please try renaming it to `.txt` and resend.
>>>
>>> On 20/3/2023 16:55, 傅继晗 wrote:
>>> > Hello maintainers:
>>> >
>>> > According to the Microsoft Learn page "setlocale, _wsetlocale"
>>> > <https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/setlocale-wsetlocale?view=msvc-170>:
>>> >
>>> > *Starting in Windows 10 version 1803 (10.0.17134.0), the Universal C
>>> > Runtime supports using a UTF-8 code page. The change means that char
>>> > strings passed to C runtime functions can expect strings in the UTF-8
>>> > encoding.*
>>> >
>>> > But libmingwex.a in the mingw-w64 toolchain doesn't support
>>> > non-ASCII file names, which causes bugs in some projects. See:
>>> > MinGW-w64 - for 32 and 64 bit Windows / Bugs / #227 basename()
>>> > truncates filenames with variable-width encoding (sourceforge.net)
>>> > <https://sourceforge.net/p/mingw-w64/bugs/227/>
>>> > and the AOSP "adb pull push error" on the Google Issue Tracker
>>> > <https://issuetracker.google.com/issues/143232373>
>>> >
>>> > So patches for dirname.c and basename.c are needed to support UTF-8
>>> > encoding.
>>> >
>>> > Greetings
>>> >
>>> > fjh1997
>>> >
>>> > _______________________________________________
>>> > Mingw-w64-public mailing list
>>> > [email protected]
>>> > https://lists.sourceforge.net/lists/listinfo/mingw-w64-public
>>>
>>
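
As an illustration of why byte-oriented path handling goes wrong with a
variable-width encoding such as GBK (the kind of problem bug #227
describes), here is a minimal sketch. It is not the mingw-w64
implementation, which converts the path with mbstowcs and works on wide
characters instead. The helper names and the simplified lead-byte range
check (0x81 to 0xFE for CP936) are assumptions for illustration only; on
Windows, IsDBCSLeadByteEx could serve this role. Trailing separators are
not stripped, unlike a real basename.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: in GBK (CP936), lead bytes are 0x81..0xFE. */
static int is_gbk_lead_byte(unsigned char c)
{
    return c >= 0x81 && c <= 0xFE;
}

/* Naive scan: treats every '/' or '\\' byte as a path separator, even
 * when that byte is the second half of a double-byte character. */
static const char *last_component_naive(const char *path)
{
    const char *base = path;
    const char *p;

    for (p = path; *p; ++p)
        if ((*p == '/' || *p == '\\') && p[1] != '\0')
            base = p + 1;
    return base;
}

/* Lead-byte-aware scan: skips the trail byte of each double-byte
 * character so it can never be mistaken for a separator. */
static const char *last_component_dbcs(const char *path)
{
    const char *base = path;
    const char *p = path;

    while (*p) {
        if (is_gbk_lead_byte((unsigned char)*p) && p[1] != '\0') {
            p += 2;             /* skip lead byte and trail byte */
            continue;
        }
        if ((*p == '/' || *p == '\\') && p[1] != '\0')
            base = p + 1;
        ++p;
    }
    return base;
}
```

For the bytes "dir/", 0x81, 0x5C, "a.txt", the naive scan returns the
component starting at "a.txt" because the trail byte 0x5C equals '\\',
while the lead-byte-aware scan returns the component starting at 0x81.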
From 7001e4b2635948cc6283f0534afa08457a342fd0 Mon Sep 17 00:00:00 2001
From: FunnyBiu <[email protected]>
Date: Tue, 21 Mar 2023 16:26:56 +0800
Subject: [PATCH] Create dirname.c

---
 mingw-w64-crt/misc/dirname.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mingw-w64-crt/misc/dirname.c b/mingw-w64-crt/misc/dirname.c
index 9c5cf87db..ce38a3190 100644
--- a/mingw-w64-crt/misc/dirname.c
+++ b/mingw-w64-crt/misc/dirname.c
@@ -40,7 +40,7 @@ dirname(char *path)
 
   if (locale != NULL)
     locale = strdup (locale);
-  setlocale (LC_CTYPE, "");
+  setlocale (LC_CTYPE, 0);
 
   if (path && *path)
     {
From ebfbf0c5320a5d88aa04e78c3da8870f368305b9 Mon Sep 17 00:00:00 2001
From: FunnyBiu <[email protected]>
Date: Tue, 21 Mar 2023 16:30:52 +0800
Subject: [PATCH] Update basename.c

---
 mingw-w64-crt/misc/basename.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mingw-w64-crt/misc/basename.c b/mingw-w64-crt/misc/basename.c
index c45dbbb36..a19137116 100644
--- a/mingw-w64-crt/misc/basename.c
+++ b/mingw-w64-crt/misc/basename.c
@@ -41,7 +41,7 @@ basename (char *path)
 
   if (locale != NULL)
     locale = strdup (locale);
-  setlocale (LC_CTYPE, "");
+  setlocale (LC_CTYPE, 0);
 
   if (path && *path)
     {