On 11/12/2017 07:46 PM, Eli Zaretskii wrote:
>> From: Tim Rühsen <[email protected]>
>> Date: Sun, 12 Nov 2017 14:50:47 +0100
>> Cc: YX Hao <[email protected]>
>>
>> As I understand, the second patch is still in discussion with Eli. Since I
>> do not have Windows, I can't help you here. Though what I saw from the
>> discussion, you address a portability issue that likely should be solved
>> within gnulib. Maybe you could (in parallel) send a mail to
>> [email protected] with a link to your discussion with Eli. There might be
>> some people with deeper knowledge.
> 
> I don't think it's a Gnulib issue.  The problem is that on Windows,
> the implicit call at the beginning of Wget
> 
>   setlocale (LC_ALL, "C");

Why is there an implicit call with "C"? There is an explicit call with "".
From the man page:
"If locale is an empty string, "", each part of the locale that should
be modified is set according to the environment variables."
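
To make the difference concrete, here is a minimal sketch in C of the two
cases. It relies only on standard C behaviour: at program startup the runtime
acts as if setlocale (LC_ALL, "C") had been called, and the "" argument asks
for the locale named by the environment. The helper names starts_in_c_locale
and adopt_environment_locale are made up for illustration and are not taken
from the Wget sources.

```c
#include <locale.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: returns 1 if the current locale is still "C".
   Passing NULL only queries the locale, it does not change it. */
int starts_in_c_locale(void)
{
    const char *cur = setlocale(LC_ALL, NULL);
    return cur != NULL && strcmp(cur, "C") == 0;
}

/* Hypothetical helper: switch to the locale named by the environment
   (LC_ALL, LC_*, LANG). Returns the new locale name, or NULL if the
   request could not be honoured, in which case the locale is unchanged. */
const char *adopt_environment_locale(void)
{
    return setlocale(LC_ALL, "");
}
```

Called before any other setlocale call, starts_in_c_locale () should return 1.
adopt_environment_locale () returns NULL when the environment names a locale
the system cannot provide, and the "C" locale then stays in effect.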

> is not good enough to work in multibyte locales of the Far East,
> because the Windows runtime assumes a single-byte locale after that
> call.  And since Wget happens to need to display text and create files
> with non-ASCII characters, it gets hit more than other programs.

I (hopefully) understand why this doesn't work. NTFS uses UTF-16 for
filenames. If your environment specifies a single-byte encoding
(e.g. C) and at some point we use a multibyte encoding (e.g.
UTF-8), then any automatic conversion to UTF-16 filenames is likely to
fail. For me the question is: a) does Wget have a bug (e.g. creating a
file from a wrongly encoded name string), or b) does the Windows API have
a problem?

> The proposed solution is to add a special call to setlocale which gets
> this right on Windows.

Why can't we just convert the filename string into the correct encoding
and then create the file? What am I missing?
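
As a rough sketch of what I mean by converting first: the function below
turns a UTF-8 name into UTF-16 code units by hand, so the result does not
depend on the current locale. It is an illustration only, not code from Wget.
The name utf8_to_utf16 is hypothetical, and the decoder does not reject
overlong encodings. I assume (untested, since I have no Windows) that on
Windows one would rather call MultiByteToWideChar and hand the wide string to
a wide-character function such as _wfopen.

```c
#include <stddef.h>
#include <stdint.h>

/* Convert NUL-terminated UTF-8 in SRC to UTF-16 code units in DST.
   DST_LEN is the capacity of DST in code units, terminator included.
   Returns the number of code units written (terminator excluded), or
   (size_t) -1 on malformed input or insufficient space. */
size_t utf8_to_utf16(const char *src, uint16_t *dst, size_t dst_len)
{
    const unsigned char *s = (const unsigned char *) src;
    size_t n = 0;

    while (*s) {
        uint32_t cp;
        int extra;

        /* The lead byte tells how many continuation bytes follow. */
        if (*s < 0x80)                { cp = *s;        extra = 0; }
        else if ((*s & 0xE0) == 0xC0) { cp = *s & 0x1F; extra = 1; }
        else if ((*s & 0xF0) == 0xE0) { cp = *s & 0x0F; extra = 2; }
        else if ((*s & 0xF8) == 0xF0) { cp = *s & 0x07; extra = 3; }
        else return (size_t) -1;
        s++;

        /* Each continuation byte must look like 10xxxxxx. A premature
           NUL fails this check, so we never read past the string. */
        for (; extra > 0; extra--, s++) {
            if ((*s & 0xC0) != 0x80)
                return (size_t) -1;
            cp = (cp << 6) | (uint32_t) (*s & 0x3F);
        }

        /* Reject values outside Unicode and encoded surrogates. */
        if (cp > 0x10FFFF || (cp >= 0xD800 && cp <= 0xDFFF))
            return (size_t) -1;

        if (cp < 0x10000) {
            if (n + 1 >= dst_len)
                return (size_t) -1;
            dst[n++] = (uint16_t) cp;
        } else {
            /* Code points above the BMP become a surrogate pair. */
            if (n + 2 >= dst_len)
                return (size_t) -1;
            cp -= 0x10000;
            dst[n++] = (uint16_t) (0xD800 | (cp >> 10));
            dst[n++] = (uint16_t) (0xDC00 | (cp & 0x3FF));
        }
    }

    if (n >= dst_len)
        return (size_t) -1;
    dst[n] = 0;
    return n;
}
```

With a buffer of 8 code units, the UTF-8 bytes E6 97 A5 (U+65E5) come out as
the single code unit 0x65E5, and a truncated sequence such as a lone C3 byte
makes the function return (size_t) -1.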

With Best Regards, Tim

