> Date: Thu, 13 Aug 2015 19:10:41 +0200
> From: "Andries E. Brouwer" <[email protected]>
> Cc: [email protected], "Andries E. Brouwer" <[email protected]>
>
> +/* Used to determine whether bytes 128-159 are OK in a filename */
> +static int
> +have_utf8_locale() {
> +#if defined(WINDOWS) || defined(MSDOS) || defined(__CYGWIN__)
> +  /* insert some test for Windows */
> +#else
> +  char *p;
> +
> +  p = getenv("LC_ALL");
> +  if (p == NULL)
> +    p = getenv("LC_CTYPE");
> +  if (p == NULL)
> +    p = getenv("LANG");
> +  if (strstr(p, "UTF-8") != NULL || strstr(p, "UTF8") != NULL ||
> +      strstr(p, "utf-8") != NULL || strstr(p, "utf8") != NULL)
> +    return true;
> +#endif
> +  return false;
> +}
> [...]
> +  opt.restrict_files_highctrl = (have_utf8_locale() ? false : true);

I'm not sure this is the right way to fix this.

First, relying on the UTF-8 locale being announced in the environment is less portable than it could be: it's better to call 'setlocale' with the 2nd argument NULL to glean the same information. Then the ugly #ifdef above could be dropped, and at least Cygwin would not be excluded from this feature.

Moreover, even if the locale is not UTF-8, wget should attempt to convert the file names to the current locale using iconv (which I believe was what Tim suggested). This will DTRT in many more cases than the above UTF-8 centric approach, IMO.

Thanks.
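
To illustrate the 'setlocale' suggestion, here is a rough sketch of what the query could look like. It is not a tested patch against wget: the helper name locale_name_is_utf8 is made up for this example, and the sketch assumes the program has already called setlocale (LC_ALL, "") or similar at startup, since otherwise the C library reports the default "C" locale.

```c
#include <locale.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: true if a locale name such as "en_US.UTF-8"
   mentions a UTF-8 codeset in one of the common spellings.  */
static bool
locale_name_is_utf8 (const char *name)
{
  if (name == NULL)
    return false;
  return strstr (name, "UTF-8") != NULL || strstr (name, "UTF8") != NULL
         || strstr (name, "utf-8") != NULL || strstr (name, "utf8") != NULL;
}

/* Ask the C library for the current LC_CTYPE locale instead of reading
   LC_ALL, LC_CTYPE and LANG by hand.  A NULL second argument only
   queries the locale; it does not change it.  */
static bool
have_utf8_locale (void)
{
  return locale_name_is_utf8 (setlocale (LC_CTYPE, NULL));
}
```

Because the locale name comes from the C library rather than from getenv, no platform-specific #ifdef is needed here, and the NULL check covers the case where no name is available.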
