On Sun, Feb 27, 2022 at 09:01:51AM +0200, Eli Zaretskii wrote:
> > From: Gavin Smith <[email protected]>
> > Date: Sat, 26 Feb 2022 22:23:04 +0000
> >
> > Don't use the locale encoding by default for encoding filenames.
>
> I think this is a mistake, and at least on Windows we must use the
> locale's encoding of file names by default (unless Perl has the
> ability to support the entire Unicode range of characters in file
> names on Windows -- does it?).

The plan is indeed to use the locale on Windows.

> As a data point: Emacs uses the locale's codeset as the default
> file-name encoding for the last 15 years, on all supported systems,
> and we have yet to hear about any significant problems with that. (On
> MS-Windows, we switched to UTF-8 several years ago, but that required
> to write replacements for every libc API that accepts file names, and
> in that replacement to convert from UTF-8 to UTF-16, then call the
> corresponding "wide" API that can accept wchar_t strings as file
> names.)
>
> I think @documentencoding is only relevant for file names that come
> from the Texinfo source, and it's only relevant for _decoding_ those
> file names into the internal representation. When encoding them
> before passing them to file-related APIs, those file names should be
> encoded using the locale's encoding (by default). IOW,
> @documentencoding just tells us how the file names are encoded in the
> document, not how they are encoded in the filesystem.

I agree with you, but Gavin's scenario (explained in
https://lists.gnu.org/archive/html/bug-texinfo/2022-02/msg00111.html)
is also possible. People may well unpack a tar file, maybe run convmv
on the main file, but leave the include file names as they are. In
that case, using @documentencoding is the best bet.

In any case, this should be controlled by a customization variable, so
that the default can easily be changed in the future if more
information emerges on the most likely scenarios. And on Windows, the
plan is to set the customization variable so that the locale is used.

It would also be possible to use @documentencoding for include files,
but use the locale for file output. This is a bit similar to what we do
for Texinfo manuals, for which the output encoding can differ from the
input encoding.

-- 
Pat
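
To make the last option concrete, here is a rough sketch in Python
(not texi2any's actual Perl code). It assumes a single override value
standing in for the customization variable, uses the locale on
Windows, and otherwise uses the document encoding for include files
and the locale for output. The function and parameter names are all
hypothetical, for illustration only.

```python
import locale
import sys


def choose_file_name_encoding(purpose, document_encoding=None,
                              override=None, platform=sys.platform):
    """Return the codec name used to encode a file name.

    purpose is "input" (include files named in the source) or
    "output" (files being written). Every name here is hypothetical;
    none of them is a real texi2any option.
    """
    if purpose not in ("input", "output"):
        raise ValueError("purpose must be 'input' or 'output'")
    # Encoding reported by the user's locale, with a UTF-8 fallback.
    locale_encoding = locale.getpreferredencoding(False) or "utf-8"
    # An explicit customization always wins.
    if override is not None:
        return override
    # Assumed policy: on Windows, always follow the locale.
    if platform.startswith("win"):
        return locale_encoding
    # Include files keep the encoding declared by @documentencoding.
    if purpose == "input" and document_encoding:
        return document_encoding
    # Output files, or documents without a declared encoding.
    return locale_encoding


def encode_file_name(name, purpose, **options):
    """Encode a decoded (Unicode) file name to bytes for file APIs."""
    return name.encode(choose_file_name_encoding(purpose, **options))
```

With this policy, an include file named "caf\u00e9.texi" in a Latin-1
document would be looked up under its Latin-1 bytes on a non-Windows
system, while output file names would follow the locale unless the
override is set.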
