Re: [Mingw-w64-public] [patch ]add UTF-8 support for dirname and basename

2023-03-21 Thread 傅继晗
Thanks for the patient explanation. It was my mistake. I now understand
that the dirname() function starts out in the "C" locale, and that
`setlocale(LC_CTYPE, "")` changes the locale to the system default, which
is what causes the truncation.

LIU Hao wrote on Tue, 21 Mar 2023 at 17:39:

> On 2023/3/21 17:15, 傅继晗 wrote:
> > Yes, MSVCRT doesn't recognize UTF-8. After debugging, I figured out that
> > `setlocale(LC_CTYPE, ".UTF8");` is equal to `setlocale(LC_CTYPE, 0);` in
> > MSVCRT, and `setlocale(LC_CTYPE, 0);` also makes it work in UCRT. So,
> > instead of `setlocale(LC_CTYPE, ".UTF8");`, `setlocale(LC_CTYPE, 0);` is
> > the temporary solution in both cases. I don't know why, but it works
> > anyway; I have tested it with GBK, UTF8 and ISO8859-6.
> >
>
> What's the purpose of this patch?
>
> `setlocale(LC_CTYPE, "")` means to 'set' the locale in an
> implementation-defined way. Typically this means to obtain the current
> locale from your system, via configuration or environment variables.
> This is necessary because a program starts in the "C" locale.
>
> `setlocale(LC_CTYPE, 0)` is equivalent to `setlocale(LC_CTYPE, NULL)` and
> gets the name of the current locale, without actually modifying it. In
> your patch you discard its result so it does nothing.
>
>
> --
> Best regards,
> LIU Hao
>
>

___
Mingw-w64-public mailing list
Mingw-w64-public@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/mingw-w64-public


Re: [Mingw-w64-public] [patch ]add UTF-8 support for dirname and basename

2023-03-21 Thread 傅继晗
No, `setlocale(LC_CTYPE, 0)` is not exactly the same as `setlocale(LC_CTYPE,
NULL)`. `setlocale(LC_CTYPE, 0)` returns a null pointer, and
`setlocale(LC_CTYPE, NULL)` returns an error string that represents the
current "C" locale. But both of them do modify the locale in the same
way. I have tested this many times and did not find anything related in
the Microsoft documentation.

LIU Hao wrote on Tue, 21 Mar 2023 at 17:39:

> On 2023/3/21 17:15, 傅继晗 wrote:
> > Yes, MSVCRT doesn't recognize UTF-8. After debugging, I figured out that
> > `setlocale(LC_CTYPE, ".UTF8");` is equal to `setlocale(LC_CTYPE, 0);` in
> > MSVCRT, and `setlocale(LC_CTYPE, 0);` also makes it work in UCRT. So,
> > instead of `setlocale(LC_CTYPE, ".UTF8");`, `setlocale(LC_CTYPE, 0);` is
> > the temporary solution in both cases. I don't know why, but it works
> > anyway; I have tested it with GBK, UTF8 and ISO8859-6.
> >
>
> What's the purpose of this patch?
>
> `setlocale(LC_CTYPE, "")` means to 'set' the locale in an
> implementation-defined way. Typically this means to obtain the current
> locale from your system, via configuration or environment variables.
> This is necessary because a program starts in the "C" locale.
>
> `setlocale(LC_CTYPE, 0)` is equivalent to `setlocale(LC_CTYPE, NULL)` and
> gets the name of the current locale, without actually modifying it. In
> your patch you discard its result so it does nothing.
>
>
> --
> Best regards,
> LIU Hao
>
>



Re: [Mingw-w64-public] [patch ]add UTF-8 support for dirname and basename

2023-03-21 Thread LIU Hao

On 2023/3/21 17:15, 傅继晗 wrote:

> Yes, MSVCRT doesn't recognize UTF-8. After debugging, I figured out that
> `setlocale(LC_CTYPE, ".UTF8");` is equal to `setlocale(LC_CTYPE, 0);` in
> MSVCRT, and `setlocale(LC_CTYPE, 0);` also makes it work in UCRT. So,
> instead of `setlocale(LC_CTYPE, ".UTF8");`, `setlocale(LC_CTYPE, 0);` is
> the temporary solution in both cases. I don't know why, but it works
> anyway; I have tested it with GBK, UTF8 and ISO8859-6.



What's the purpose of this patch?

`setlocale(LC_CTYPE, "")` means to 'set' the locale in an implementation-defined way. Typically this 
means to obtain the current locale from your system, via configuration or environment variables. 
This is necessary because a program starts in the "C" locale.


`setlocale(LC_CTYPE, 0)` is equivalent to `setlocale(LC_CTYPE, NULL)` and gets the name of the 
current locale, without actually modifying it. In your patch you discard its result so it does nothing.
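
The difference between the two calls can be checked with a short, portable C sketch. It is only an illustration under stated assumptions: the helper names `dup_string`, `query_keeps_locale` and `run_in_system_ctype` are hypothetical and do not come from dirname.c or basename.c; the save-and-restore shape merely mirrors the `strdup (locale)` context visible in the patches.

```c
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: heap copy of s, or NULL (plain ISO C stand-in for strdup). */
static char *dup_string(const char *s)
{
    char *copy = s ? malloc(strlen(s) + 1) : NULL;
    if (copy)
        strcpy(copy, s);
    return copy;
}

/* Returns 1 if a query call leaves LC_CTYPE unchanged, 0 otherwise.
   A null second argument (NULL or 0) only asks for the current name. */
int query_keeps_locale(void)
{
    char *before = dup_string(setlocale(LC_CTYPE, NULL));
    const char *after;
    int same;

    if (!before)
        return 0;
    (void) setlocale(LC_CTYPE, 0);   /* same as NULL: result discarded, no effect */
    after = setlocale(LC_CTYPE, NULL);
    same = after && strcmp(before, after) == 0;
    free(before);
    return same;
}

/* Save the current LC_CTYPE, switch to the environment's locale with "",
   call fn, then restore. Not thread-safe: by default the locale is
   shared by the whole process. */
int run_in_system_ctype(void (*fn)(void))
{
    char *saved = dup_string(setlocale(LC_CTYPE, NULL));

    if (!saved)
        return 0;
    setlocale(LC_CTYPE, "");         /* this call does change the locale */
    if (fn)
        fn();
    setlocale(LC_CTYPE, saved);      /* put the previous locale back */
    free(saved);
    return 1;
}
```

On a conforming runtime `query_keeps_locale()` returns 1, which is why replacing `""` with `0` in the patch leaves the function running in whatever locale the caller already had.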



--
Best regards,
LIU Hao





Re: [Mingw-w64-public] [patch ]add UTF-8 support for dirname and basename

2023-03-21 Thread 傅继晗
Yes, MSVCRT doesn't recognize UTF-8. After debugging, I figured out that
`setlocale(LC_CTYPE, ".UTF8");` is equal to `setlocale(LC_CTYPE, 0);` in
MSVCRT, and `setlocale(LC_CTYPE, 0);` also makes it work in UCRT. So,
instead of `setlocale(LC_CTYPE, ".UTF8");`, `setlocale(LC_CTYPE, 0);` is
the temporary solution in both cases. I don't know why, but it works
anyway; I have tested it with GBK, UTF8 and ISO8859-6.

Alvin Wong wrote on Mon, 20 Mar 2023 at 21:22:

> The thing is, the code point sequence you have here is not valid UTF-8 at
> all. If it is indeed doing the conversion from UTF-8 you will most likely
> get incorrect results or crashes.
>
> As you realized and reported in another reply, you were actually testing
> with msvcrt. It is likely that msvcrt just ignored the unsupported locale
> and was doing something unspecified.
From 7001e4b2635948cc6283f0534afa08457a342fd0 Mon Sep 17 00:00:00 2001
From: FunnyBiu <549308...@qq.com>
Date: Tue, 21 Mar 2023 16:26:56 +0800
Subject: [PATCH] Create dirname.c

---
 mingw-w64-crt/misc/dirname.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mingw-w64-crt/misc/dirname.c b/mingw-w64-crt/misc/dirname.c
index 9c5cf87db..ce38a3190 100644
--- a/mingw-w64-crt/misc/dirname.c
+++ b/mingw-w64-crt/misc/dirname.c
@@ -40,7 +40,7 @@ dirname(char *path)
 
   if (locale != NULL)
 locale = strdup (locale);
-  setlocale (LC_CTYPE, "");
+  setlocale (LC_CTYPE, 0);
 
   if (path && *path)
 {
From ebfbf0c5320a5d88aa04e78c3da8870f368305b9 Mon Sep 17 00:00:00 2001
From: FunnyBiu <549308...@qq.com>
Date: Tue, 21 Mar 2023 16:30:52 +0800
Subject: [PATCH] Update basename.c

---
 mingw-w64-crt/misc/basename.c | 2 +-
 1 file changed, 1 insertion(+), 1 

Re: [Mingw-w64-public] [patch ]add UTF-8 support for dirname and basename

2023-03-20 Thread Alvin Wong via Mingw-w64-public
The thing is, the code point sequence you have here is not valid UTF-8
at all. If it is indeed doing the conversion from UTF-8 you will most
likely get incorrect results or crashes.
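
The claim about the test bytes can be checked mechanically. The sketch below uses a hypothetical helper, `is_valid_utf8`, which is not part of mingw-w64; it is a plain well-formedness check written for this illustration.

```c
#include <stddef.h>

/* Returns 1 if the n bytes at s are well-formed UTF-8, 0 otherwise. */
int is_valid_utf8(const unsigned char *s, size_t n)
{
    size_t i = 0;

    while (i < n) {
        unsigned char c = s[i];
        unsigned long cp;
        size_t len, k;

        if (c < 0x80) {                      /* ASCII byte */
            i++;
            continue;
        } else if ((c & 0xE0) == 0xC0) {     /* lead byte of a 2-byte sequence */
            len = 2; cp = c & 0x1F;
        } else if ((c & 0xF0) == 0xE0) {     /* lead byte of a 3-byte sequence */
            len = 3; cp = c & 0x0F;
        } else if ((c & 0xF8) == 0xF0) {     /* lead byte of a 4-byte sequence */
            len = 4; cp = c & 0x07;
        } else {
            return 0;                        /* stray continuation or invalid lead */
        }

        if (n - i < len)
            return 0;                        /* truncated sequence */
        for (k = 1; k < len; k++) {
            if ((s[i + k] & 0xC0) != 0x80)
                return 0;                    /* not a continuation byte */
            cp = (cp << 6) | (s[i + k] & 0x3F);
        }

        /* Reject overlong forms, surrogates and values above U+10FFFF. */
        if ((len == 2 && cp < 0x80) || (len == 3 && cp < 0x800) ||
            (len == 4 && cp < 0x10000))
            return 0;
        if ((cp >= 0xD800 && cp <= 0xDFFF) || cp > 0x10FFFF)
            return 0;
        i += len;
    }
    return 1;
}
```

Run on the sixteen path bytes from the test case, it returns 0: `0xcc` announces a two-byte sequence, but the following `0xec` is not a continuation byte.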


As you realized and reported in another reply, you were actually
testing with msvcrt. It is likely that msvcrt just ignored the
unsupported locale and was doing something unspecified.
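
One way to find out whether a given C runtime accepted a locale name, rather than guessing from the behaviour afterwards, is to look at the return value of `setlocale`: ISO C specifies a null pointer when the request cannot be honored. The sketch below is a minimal illustration; `ctype_locale_supported` is a hypothetical name, and what MSVCRT or UCRT actually return for `".UTF8"` is not asserted here.

```c
#include <locale.h>
#include <stdlib.h>
#include <string.h>

/* Returns 1 if the C runtime accepts `name` for LC_CTYPE, 0 if it rejects it.
   The previous locale is restored before returning. */
int ctype_locale_supported(const char *name)
{
    const char *cur = setlocale(LC_CTYPE, NULL);   /* query only */
    char *saved;
    int ok;

    if (!cur)
        return 0;
    saved = malloc(strlen(cur) + 1);   /* copy: later calls may overwrite cur */
    if (!saved)
        return 0;
    strcpy(saved, cur);

    ok = setlocale(LC_CTYPE, name) != NULL;   /* NULL means "not honored" */
    setlocale(LC_CTYPE, saved);               /* restore the previous locale */
    free(saved);
    return ok;
}
```

A test program could call `ctype_locale_supported(".UTF8")` once under each runtime and print the result before drawing conclusions from the output of basename().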


On 20/3/2023 19:07, 傅继晗 wrote:
> However, I use GBK as the default code page on my Windows system, and I
> tried testing it with GBK-encoded content. This trick still seems to
> work. Here is the test case.
>
> #include <stdio.h>
> extern char * __cdecl basename (char *path);
>
> /* Print each byte of s as a \xNN escape. */
> void xprint(const char *s)
> {
>     while (*s)
>         printf("\\x%02x", (int)(unsigned char)(*s++));
> }
>
> int main(int argc, char **argv)
> {
>     /* GBK encoding of "/sdcard/天天向上" */
>     char input[] = {0x2f,0x73,0x64,0x63,0x61,0x72,0x64,0x2f,
>                     0xcc,0xec,0xcc,0xec,0xcf,0xf2,0xc9,0xcf,0x00};
>     char *output;
>     printf("basename(\"");
>     xprint(input);
>     printf("\") = \"");
>     output = basename(input);
>     xprint(output);
>     printf("\"\n");
>     return 0;
> }





Re: [Mingw-w64-public] [patch ]add UTF-8 support for dirname and basename

2023-03-20 Thread 傅继晗
I checked again and found that I was actually using MSVCRT instead of UCRT.
Strangely, it works with MSVCRT but fails with UCRT.

傅继晗 wrote on Mon, 20 Mar 2023 at 19:25:

> No. I think msvcrt is outdated, so I use ucrt instead.


Re: [Mingw-w64-public] [patch ]add UTF-8 support for dirname and basename

2023-03-20 Thread LIU Hao

On 2023/3/20 19:25, 傅继晗 wrote:

> No. I think msvcrt is outdated, so I use ucrt instead.


There are many distributions that have MSVCRT as the default or only CRT
(e.g. Debian). And even with UCRT, I don't think it's valid to blindly
assume that the argument is UTF-8, which is not the default.




--
Best regards,
LIU Hao





Re: [Mingw-w64-public] [patch ]add UTF-8 support for dirname and basename

2023-03-20 Thread 傅继晗
No. I think msvcrt is outdated, so I use ucrt instead.

LIU Hao wrote on Mon, 20 Mar 2023 at 19:05:

> On 2023/3/20 18:52, Alvin Wong via Mingw-w64-public wrote:
> > Thanks for sending the patches. However my comment on these patches will
> > be that, they only work when the process ANSI codepage (ACP) is UTF-8,
> > which requires either embedding a manifest with activeCodePage set to
> > UTF-8 or setting the system ACP to UTF-8. If the process is using CP936
> > (GBK) for example, it will still be broken similar to before.
>
> Does this work for you with MSVCRT? As far as I recall MSVCRT doesn't
> support UTF-8 at all, so you will need UCRT at least. But anyway, there
> are other issues:
>
> > Just my two cents: I would prefer to remove any code that changes the
> > locale then attempt to restore it (which is not thread-safe), then
> > replace `mbstowcs` and `wcstombs` with direct usage of
> > `MultiByteToWideChar` and `WideCharToMultiByte`, which can convert
> > from/to CP_ACP directly.
>
> I totally agree with that.
>
>
> --
> Best regards,
> LIU Hao
>
>



Re: [Mingw-w64-public] [patch ]add UTF-8 support for dirname and basename

2023-03-20 Thread 傅继晗
However, I use GBK as the default code page on my Windows system, and I
tried testing it with GBK-encoded content. This trick still seems to
work. Here is the test case.

#include <stdio.h>
extern char * __cdecl basename (char *path);

/* Print each byte of s as a \xNN escape. */
void xprint(const char *s)
{
    while (*s)
        printf("\\x%02x", (int)(unsigned char)(*s++));
}

int main(int argc, char **argv)
{
    /* GBK encoding of "/sdcard/天天向上" */
    char input[] = {0x2f,0x73,0x64,0x63,0x61,0x72,0x64,0x2f,
                    0xcc,0xec,0xcc,0xec,0xcf,0xf2,0xc9,0xcf,0x00};
    char *output;
    printf("basename(\"");
    xprint(input);
    printf("\") = \"");
    output = basename(input);
    xprint(output);
    printf("\"\n");
    return 0;
}



Alvin Wong wrote on Mon, 20 Mar 2023 at 18:52:

> Hi,
>
> Thanks for sending the patches. However my comment on these patches will
> be that, they only work when the process ANSI codepage (ACP) is UTF-8,
> which requires either embedding a manifest with activeCodePage set to UTF-8
> or setting the system ACP to UTF-8. If the process is using CP936 (GBK) for
> example, it will still be broken similar to before.
>
> Just my two cents: I would prefer to remove any code that changes the
> locale then attempt to restore it (which is not thread-safe), then replace
> `mbstowcs` and `wcstombs` with direct usage of `MultiByteToWideChar` and
> `WideCharToMultiByte`, which can convert from/to CP_ACP directly.
>
> Best Regards,
> Alvin


Re: [Mingw-w64-public] [patch ]add UTF-8 support for dirname and basename

2023-03-20 Thread LIU Hao

On 2023/3/20 18:52, Alvin Wong via Mingw-w64-public wrote:
> Thanks for sending the patches. However my comment on these patches will
> be that, they only work when the process ANSI codepage (ACP) is UTF-8,
> which requires either embedding a manifest with activeCodePage set to
> UTF-8 or setting the system ACP to UTF-8. If the process is using CP936
> (GBK) for example, it will still be broken similar to before.


Does this work for you with MSVCRT? As far as I recall MSVCRT doesn't support UTF-8 at all, so you 
will need UCRT at least. But anyway, there are other issues:



> Just my two cents: I would prefer to remove any code that changes the
> locale then attempt to restore it (which is not thread-safe), then
> replace `mbstowcs` and `wcstombs` with direct usage of
> `MultiByteToWideChar` and `WideCharToMultiByte`, which can convert
> from/to CP_ACP directly.


I totally agree with that.


--
Best regards,
LIU Hao





Re: [Mingw-w64-public] [patch ]add UTF-8 support for dirname and basename

2023-03-20 Thread Alvin Wong via Mingw-w64-public

Hi,

Thanks for sending the patches. However, my comment on these patches is
that they only work when the process ANSI codepage (ACP) is UTF-8, which
requires either embedding a manifest with activeCodePage set to UTF-8 or
setting the system ACP to UTF-8. If the process is using CP936 (GBK), for
example, it will still be broken in much the same way as before.


Just my two cents: I would prefer to remove any code that changes the
locale and then attempts to restore it (which is not thread-safe), and
to replace `mbstowcs` and `wcstombs` with direct usage of
`MultiByteToWideChar` and `WideCharToMultiByte`, which can convert from/to
CP_ACP directly.
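
A rough sketch of what that conversion step could look like, without touching the locale at all, follows. It is not the code that went into mingw-w64: `acp_to_wide` is a hypothetical helper name, and the non-Windows branch is only a stub so that the snippet compiles elsewhere.

```c
#include <stdlib.h>
#include <wchar.h>
#ifdef _WIN32
#include <windows.h>
#endif

/* Convert a NUL-terminated string in the process ANSI code page (CP_ACP)
   to a newly allocated wide string. Returns NULL on failure; the caller
   frees the result with free(). */
wchar_t *acp_to_wide(const char *path)
{
#ifdef _WIN32
    wchar_t *wide;
    /* First call: ask for the required length, terminator included. */
    int n = MultiByteToWideChar(CP_ACP, MB_ERR_INVALID_CHARS, path, -1, NULL, 0);

    if (n <= 0)
        return NULL;                 /* invalid input for this code page */
    wide = malloc((size_t) n * sizeof *wide);
    if (!wide)
        return NULL;
    /* Second call: perform the conversion into the buffer. */
    if (MultiByteToWideChar(CP_ACP, MB_ERR_INVALID_CHARS, path, -1, wide, n) != n) {
        free(wide);
        return NULL;
    }
    return wide;
#else
    (void) path;                     /* stub: CP_ACP exists only on Windows */
    return NULL;
#endif
}
```

Because nothing global is modified, this avoids the save-and-restore race mentioned above; the reverse direction would use `WideCharToMultiByte` with the same two-call pattern.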


Best Regards,
Alvin

On 20/3/2023 18:36, 傅继晗 wrote:

> OK, it has a .txt extension now.



Re: [Mingw-w64-public] [patch ]add UTF-8 support for dirname and basename

2023-03-20 Thread 傅继晗
OK, it has a .txt extension now.

Alvin Wong wrote on Mon, 20 Mar 2023 at 18:10:

> Hi, if you attached a patch in your mail, it has been stripped by the
> mailing list software. Please try renaming it to `.txt` and resend.
>
> On 20/3/2023 16:55, 傅继晗 wrote:
> > Hello maintainers:
> >
> > According to the Microsoft page "setlocale, _wsetlocale | Microsoft Learn"
> > <https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/setlocale-wsetlocale?view=msvc-170>
> >
> > *Starting in Windows 10 version 1803 (10.0.17134.0), the Universal C
> > Runtime supports using a UTF-8 code page. The change means that char
> > strings passed to C runtime functions can expect strings in the UTF-8
> > encoding.*
> >
> > But the libmingwex.a in the Mingw-w64 toolchain doesn't support
> > non-ASCII file names, which causes bugs in some projects; see:
> > MinGW-w64 - for 32 and 64 bit Windows / Bugs / #227 basename() truncates
> > filenames with variable-width encoding (sourceforge.net)
> > and AOSP adb pull push error
> > Google Issue Tracker
> >
> > So patches for dirname.c and basename.c are needed to support UTF-8
> > encoding.
> >
> > Greetings
> >
> > fjh1997
> >
>
From 9334d5b8302b81a34630de6e6e940118b043d4e2 Mon Sep 17 00:00:00 2001
From: FunnyBiu <549308...@qq.com>
Date: Mon, 20 Mar 2023 16:50:52 +0800
Subject: [PATCH] Update dirname.c

---
 mingw-w64-crt/misc/dirname.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mingw-w64-crt/misc/dirname.c b/mingw-w64-crt/misc/dirname.c
index 9c5cf87db..ee950ce0b 100644
--- a/mingw-w64-crt/misc/dirname.c
+++ b/mingw-w64-crt/misc/dirname.c
@@ -40,7 +40,7 @@ dirname(char *path)
 
   if (locale != NULL)
 locale = strdup (locale);
-  setlocale (LC_CTYPE, "");
+  setlocale (LC_CTYPE, ".UTF8");
 
   if (path && *path)
 {
From 5bc55b91d51eae0ec92672a3cad86b4ee0fd3e7a Mon Sep 17 00:00:00 2001
From: FunnyBiu <549308...@qq.com>
Date: Mon, 20 Mar 2023 16:48:30 +0800
Subject: [PATCH] Update basename.c

---
 mingw-w64-crt/misc/basename.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mingw-w64-crt/misc/basename.c b/mingw-w64-crt/misc/basename.c
index c45dbbb36..9ae811cd9 100644
--- a/mingw-w64-crt/misc/basename.c
+++ b/mingw-w64-crt/misc/basename.c
@@ -41,7 +41,7 @@ basename (char *path)
 
   if (locale != NULL)
 locale = strdup (locale);
-  setlocale (LC_CTYPE, "");
+  setlocale (LC_CTYPE, ".UTF8");
 
   if (path && *path)
 {


Re: [Mingw-w64-public] [patch ]add UTF-8 support for dirname and basename

2023-03-20 Thread Alvin Wong via Mingw-w64-public
Hi, if you attached a patch in your mail, it has been stripped by the 
mailing list software. Please try renaming it to `.txt` and resend.


On 20/3/2023 16:55, 傅继晗 wrote:

> Hello maintainers:
>
> According to the Microsoft page "setlocale, _wsetlocale | Microsoft Learn"
> <https://learn.microsoft.com/en-us/cpp/c-runtime-library/reference/setlocale-wsetlocale?view=msvc-170>:
>
> *Starting in Windows 10 version 1803 (10.0.17134.0), the Universal C
> Runtime supports using a UTF-8 code page. The change means that char
> strings passed to C runtime functions can expect strings in the UTF-8
> encoding.*
>
> But the libmingwex.a in the Mingw-w64 toolchain doesn't support
> non-ASCII file names, which causes bugs in some projects; see:
> MinGW-w64 - for 32 and 64 bit Windows / Bugs / #227 basename() truncates
> filenames with variable-width encoding (sourceforge.net)
> and AOSP adb pull push error
> Google Issue Tracker
>
> So patches for dirname.c and basename.c are needed to support UTF-8
> encoding.
>
> Greetings
>
> fjh1997
