Yes, MSVCRT doesn't recognize UTF-8. After debugging, I figured out that setlocale(LC_CTYPE, ".UTF8"); is equivalent to setlocale(LC_CTYPE, 0); in MSVCRT, and setlocale(LC_CTYPE, 0); also makes it work in UCRT. So using setlocale(LC_CTYPE, 0); instead of setlocale(LC_CTYPE, ".UTF8"); is a temporary solution in both cases. I don't know why, but it works anyway; I have tested it with GBK, UTF-8 and ISO8859-6.
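For reference, here is a minimal sketch in standard C of what a null locale argument does according to the C standard: the call only queries the current setting and does not change it. The helper names `dup_ctype_locale` and `null_query_keeps_locale` are hypothetical and are not part of mingw-w64; the sketch says nothing about how MSVCRT or UCRT treat ".UTF8".

```c
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a heap copy of the current LC_CTYPE
 * locale name, or NULL on failure. The caller frees the result. */
char *dup_ctype_locale(void)
{
    /* A null second argument only queries the current locale. */
    const char *name = setlocale(LC_CTYPE, NULL);
    char *copy;

    if (name == NULL)
        return NULL;
    copy = malloc(strlen(name) + 1);
    if (copy != NULL)
        strcpy(copy, name);
    return copy;
}

/* Hypothetical check: returns 1 if setlocale(LC_CTYPE, 0) left the
 * locale name unchanged, 0 if it changed, -1 on allocation failure. */
int null_query_keeps_locale(void)
{
    char *before = dup_ctype_locale();
    char *after;
    int same;

    setlocale(LC_CTYPE, 0);          /* 0 is a null pointer constant here */
    after = dup_ctype_locale();

    if (before == NULL || after == NULL) {
        free(before);
        free(after);
        return -1;
    }
    same = (strcmp(before, after) == 0);
    free(before);
    free(after);
    return same;
}
```

If this holds for the runtime in use, replacing setlocale(LC_CTYPE, "") with setlocale(LC_CTYPE, 0) would leave the process in whatever locale it already had, which may be why the behaviour looks the same across encodings.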
Alvin Wong <[email protected]> wrote on Mon, Mar 20, 2023 at 21:22:

> The thing is, the code point sequence you have here is not valid UTF-8 at
> all. If it is indeed doing the conversion from UTF-8, you will most likely
> get incorrect results or crashes.
>
> As you realized and reported in another reply, you were actually
> testing with msvcrt. It is likely that msvcrt just ignored the unsupported
> locale and was doing something unspecified.
> On 20/3/2023 19:07, 傅继晗 wrote:
>
> However, I use GBK as the default code page on my Windows system, and I
> tried to test it with GBK-encoded content. This trick still seems to work.
> Here is the test case.
>
> ----------------------------------------------------------------------
> #include <stdio.h>
> extern char * __cdecl basename (char *path);
>
> /* Print every byte of s as a \xNN escape. */
> void xprint(const char *s)
> {
>     while (*s)
>         printf("\\x%02x", (int)(unsigned char)(*s++));
> }
>
> int main(int argc, char **argv)
> {
>     /* GBK encoding of "/sdcard/天天向上" */
>     char input[] = {0x2f,0x73,0x64,0x63,0x61,0x72,0x64,0x2f,
>                     0xcc,0xec,0xcc,0xec,0xcf,0xf2,0xc9,0xcf,0x00};
>     char *output;
>     printf("basename(\"");
>     xprint(input);
>     printf("\") = \"");
>     output = basename(input);
>     xprint(output);
>     printf("\"\n");
>     return 0;
> }
> ----------------------------------------------------------------------
>
>
> Alvin Wong <[email protected]> wrote on Mon, Mar 20, 2023 at 18:52:
>
>> Hi,
>>
>> Thanks for sending the patches. However, my comment on these patches is
>> that they only work when the process ANSI codepage (ACP) is UTF-8,
>> which requires either embedding a manifest with activeCodePage set to UTF-8
>> or setting the system ACP to UTF-8. If the process is using CP936 (GBK), for
>> example, it will still be broken in a similar way to before.
>>
>> Just my two cents: I would prefer to remove any code that changes the
>> locale and then attempts to restore it (which is not thread-safe), then
>> replace `mbstowcs` and `wcstombs` with direct usage of `MultiByteToWideChar`
>> and `WideCharToMultiByte`, which can convert from/to CP_ACP directly.
>>
>> Best Regards,
>> Alvin
>> On 20/3/2023 18:36, 傅继晗 wrote:
>>
>> OK, it has a txt extension now.
>>
>> Alvin Wong <[email protected]> wrote on Mon, Mar 20, 2023 at 18:10:
>>
>>> Hi, if you attached a patch in your mail, it has been stripped by the
>>> mailing list software. Please try renaming it to `.txt` and resend.
>>>
>>> On 20/3/2023 16:55, 傅继晗 wrote:
>>> > Hello maintainers:
>>> >
>>> > According to the Microsoft page "setlocale, _wsetlocale | Microsoft Learn"
>>> > <https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/setlocale-wsetlocale?view=msvc-170>
>>> >
>>> > *Starting in Windows 10 version 1803 (10.0.17134.0), the Universal C
>>> > Runtime supports using a UTF-8 code page. The change means that char
>>> > strings passed to C runtime functions can expect strings in the UTF-8
>>> > encoding.*
>>> >
>>> > But libmingwex.a in the mingw-w64 toolchain doesn't support
>>> > non-ASCII file names, which causes bugs in some projects; see:
>>> > MinGW-w64 - for 32 and 64 bit Windows / Bugs / #227 basename() truncates
>>> > filenames with variable-width encoding (sourceforge.net)
>>> > <https://sourceforge.net/p/mingw-w64/bugs/227/>
>>> > and "AOSP adb pull push error" on the Google Issue Tracker
>>> > <https://issuetracker.google.com/issues/143232373>
>>> >
>>> > So the patches for dirname.c and basename.c are needed to support UTF-8
>>> > encoding.
>>> >
>>> > Greetings
>>> >
>>> > fjh1997
>>> >
>>> > _______________________________________________
>>> > Mingw-w64-public mailing list
>>> > [email protected]
>>> > https://lists.sourceforge.net/lists/listinfo/mingw-w64-public
>>>
>>
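To illustrate why a byte-wise scan for path separators can truncate names in a variable-width encoding (the bug referenced in the thread), here is a rough, portable sketch. It is not the mingw-w64 implementation and does not use the `MultiByteToWideChar` approach suggested above; it hard-codes the assumption that GBK lead bytes are 0x81..0xFE and that each lead byte is followed by exactly one trail byte, which may itself be 0x5C (backslash). The function names are hypothetical.

```c
#include <stddef.h>

/* Assumption: in GBK (CP936) a byte in 0x81..0xFE starts a two-byte character. */
int is_gbk_lead_byte(unsigned char c)
{
    return c >= 0x81 && c <= 0xFE;
}

/* Hypothetical helper: returns a pointer to the text after the last
 * single-byte '/' or '\\' in a GBK-encoded path. Trailing separators
 * and drive letters are deliberately not handled in this sketch. */
const char *gbk_last_component(const char *path)
{
    const unsigned char *p = (const unsigned char *)path;
    const char *last = path;

    while (*p != '\0') {
        if (is_gbk_lead_byte(*p) && p[1] != '\0') {
            /* Skip both bytes so a 0x5C trail byte is not taken for '\\'. */
            p += 2;
            continue;
        }
        if (*p == '/' || *p == '\\')
            last = (const char *)(p + 1);
        p++;
    }
    return last;
}
```

With a path such as "dir/" followed by the bytes 0x81 0x5C and "b", a naive scan for the last backslash would return only "b", whereas the lead-byte-aware scan keeps the two-byte character as part of the file name.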
From 7001e4b2635948cc6283f0534afa08457a342fd0 Mon Sep 17 00:00:00 2001
From: FunnyBiu <[email protected]>
Date: Tue, 21 Mar 2023 16:26:56 +0800
Subject: [PATCH] Create dirname.c

---
 mingw-w64-crt/misc/dirname.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mingw-w64-crt/misc/dirname.c b/mingw-w64-crt/misc/dirname.c
index 9c5cf87db..ce38a3190 100644
--- a/mingw-w64-crt/misc/dirname.c
+++ b/mingw-w64-crt/misc/dirname.c
@@ -40,7 +40,7 @@ dirname(char *path)
   if (locale != NULL)
     locale = strdup (locale);
-  setlocale (LC_CTYPE, "");
+  setlocale (LC_CTYPE, 0);
   if (path && *path)
     {
From ebfbf0c5320a5d88aa04e78c3da8870f368305b9 Mon Sep 17 00:00:00 2001
From: FunnyBiu <[email protected]>
Date: Tue, 21 Mar 2023 16:30:52 +0800
Subject: [PATCH] Update basename.c

---
 mingw-w64-crt/misc/basename.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mingw-w64-crt/misc/basename.c b/mingw-w64-crt/misc/basename.c
index c45dbbb36..a19137116 100644
--- a/mingw-w64-crt/misc/basename.c
+++ b/mingw-w64-crt/misc/basename.c
@@ -41,7 +41,7 @@ basename (char *path)
   if (locale != NULL)
     locale = strdup (locale);
-  setlocale (LC_CTYPE, "");
+  setlocale (LC_CTYPE, 0);
   if (path && *path)
     {
