On 11/12/2017 07:46 PM, Eli Zaretskii wrote:
>> From: Tim Rühsen <[email protected]>
>> Date: Sun, 12 Nov 2017 14:50:47 +0100
>> Cc: YX Hao <[email protected]>
>>
>> As I understand, the second patch is still in discussion with Eli. Since I
>> do not have Windows, I can't help you here. Though what I saw from the
>> discussion, you address a portability issue that likely should be solved
>> within gnulib. Maybe you could (in parallel) send a mail to
>> [email protected] with a link to your discussion with Eli. There might be
>> some people with deeper knowledge.
>
> I don't think it's a Gnulib issue. The problem is that on Windows,
> the implicit call at the beginning of Wget
>
>   setlocale (LC_ALL, "C");

Why is there an explicit call with "C"? There is an explicit call with "".

From the man page:
"If locale is an empty string, "", each part of the locale that should be
modified is set according to the environment variables."

> is not good enough to work in multibyte locales of the Far East,
> because the Windows runtime assumes a single-byte locale after that
> call. And since Wget happens to need to display text and create files
> with non-ASCII characters, it gets hit more than other programs.

I (hopefully) understand why this doesn't work. NTFS uses UTF-16 for
filenames. If your environment specifies a single-byte encoding (e.g. C)
and at some point we use a multibyte encoding (e.g. UTF-8), then any
automatic conversion to UTF-16 filenames is likely to fail.

For me the question is:
a) does wget have a bug (e.g. creating a filename from a wrongly encoded
   name string), or
b) does the Windows API have a problem?

> The proposed solution is to add a special call to setlocale which gets
> this right on Windows.

Why can't we just convert the filename string into the correct encoding
and then create the file? What am I missing?

With Best Regards, Tim
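
For readers following the thread, here is a minimal C sketch of the difference between the two calls discussed above. Per the C standard, a program starts in the "C" locale, as if setlocale (LC_ALL, "C") had been executed at startup; calling setlocale (LC_ALL, "") then switches to whatever the environment specifies. The helper names locale_is_c and adopt_environment_locale are hypothetical and not taken from Wget, and the sketch does not reproduce the Windows-specific behaviour Eli describes.

```c
#include <assert.h>
#include <locale.h>
#include <string.h>

/* Hypothetical helper: returns 1 if the whole program locale is "C".
   Passing NULL only queries the current locale, it changes nothing. */
int locale_is_c(void)
{
    const char *cur = setlocale(LC_ALL, NULL);
    return cur != NULL && strcmp(cur, "C") == 0;
}

/* Hypothetical helper: adopt the locale named by the environment
   (LC_ALL, LC_CTYPE, LANG, ...). If the environment names a locale the
   system does not support, setlocale returns NULL and the locale is left
   unchanged, so fall back to "C" explicitly.
   The returned string is only valid until the next setlocale call. */
const char *adopt_environment_locale(void)
{
    const char *name = setlocale(LC_ALL, "");
    if (name == NULL) {
        name = setlocale(LC_ALL, "C");
        assert(name != NULL);   /* the "C" locale is always available */
    }
    return name;
}
```

Before any setlocale call, locale_is_c() returns 1; after adopt_environment_locale() the result depends on the user's environment, which is exactly why the two calls behave differently for non-ASCII text.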
