> From: Gavin Smith <[email protected]>
> Date: Sat, 22 Oct 2022 21:36:20 +0100
> Cc: [email protected], [email protected]
>
> > In any case, both methods produce just a number, not an encoding
> > identifier.  We will need to prepend something to it, like "CP" maybe,
> > depending on how the codeset is used by texi2any.
>
> Could you try the following?

This works, except that I used

> +      $locale_encoding = 'cp'.$CP;

(lower-case "cp") instead of "CP", since that's what
https://perldoc.perl.org/Encode seemed to imply.

texinfo.info is now successfully built.

The next problem is all the non-ASCII file names in tp/tests/ and its
subdirectories: they cause failures like this one:

  make[2]: Entering directory `/d/gnu/texinfo-6.8.90/tp/tests'
  Making check in .
  make[3]: Entering directory `/d/gnu/texinfo-6.8.90/tp/tests'
  make input_file_names_recoded_stamp.txt
  make[4]: Entering directory `/d/gnu/texinfo-6.8.90/tp/tests'
  make[4]: *** No rule to make target `input/included_lat?rn1.texi', needed by `input_file_names_recoded_stamp.txt'. Stop.
  make[4]: Leaving directory `/d/gnu/texinfo-6.8.90/tp/tests'
  Makefile:3105: recipe for target `check-am' failed
  make[3]: *** [check-am] Error 2

That "?" there stands for a non-ASCII character in the file's name.

These file names are encoded in UTF-8 in the tarball, and are
referenced in UTF-8 encoding in the corresponding Makefiles.  This
causes trouble on Windows due to several issues, which are not really
predictable up front:

 . Windows doesn't natively support UTF-8 encoded file names.

 . Various programs used for building and testing Texinfo may or may
   not support UTF-8 file names on Windows.  The relevant programs
   include:

     - the 'tar' program used to unpack the tarball
     - the 'make' program used to run the tests
     - the Unixy shell used in the test scripts
     - various Coreutils programs used by the tests

   I use a native port of bsdtar (from libarchive) to unpack the
   tarball, which produces file names where UTF-8 byte sequences are
   literally present in the file system with each byte as a separate
   character (Windows attempts to interpret each byte as a character
   encoded in the system codepage).  My Make and Bash are from MSYS,
   and they don't support UTF-8 in file names.  Likewise with
   Coreutils.
   People who use tools from Cygwin or MSYS2 (which is a fork of
   Cygwin) might have the UTF-8 file names supported, but only if they
   consistently use all of the tools from those packages, and run on a
   recent enough version of Windows (because Cygwin and MSYS2 dropped
   support for Windows versions before 7).

 . Finally, texi2any itself: if it uses a native Windows port of Perl
   (as it does in my case), it won't support UTF-8 file names, nor
   will it reliably accept UTF-8 encoded command-line arguments.

So this is a mess on MS-Windows, and I really am at a loss as to how
to solve it reliably.  I don't want to lose the ability to run the
texi2any test suite with the native MS-Windows port, so perhaps some
way of skipping the files with non-ASCII file names could be added,
either automatically on MS-Windows or by some Make variable (as in
"make check NONASCII=no")?  I presume that those files test
capabilities that are not relevant for Windows anyway, since native
Windows programs cannot reliably accept UTF-8 encoded command-line
arguments.

Thanks.
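
To illustrate the file-name problem described above, here is a minimal Python sketch, not part of Texinfo or its test suite. It shows how a UTF-8 encoded name turns into one character per byte when the bytes are interpreted in a single-byte system codepage, and how names could be classified as ASCII-only for skipping. The file name, the choice of cp1252 as the system codepage, and the helpers is_ascii_name and skip_non_ascii are all assumptions made for illustration.

```python
# Hypothetical file name: the real names under tp/tests/ may differ.
name = "included_\u00e9.texi"  # contains U+00E9 (e with acute accent)

# Bytes as they would be stored in the tarball (UTF-8).
utf8_bytes = name.encode("utf-8")

# Assumption: the system codepage is cp1252.  A tool that is unaware of
# UTF-8 passes the raw bytes through, and each byte is then interpreted
# as one character of that codepage.
seen_on_windows = utf8_bytes.decode("cp1252")


def is_ascii_name(filename: str) -> bool:
    """Return True if every character of filename is plain ASCII."""
    return all(ord(ch) < 128 for ch in filename)


def skip_non_ascii(filenames):
    """Hypothetical filter: keep only inputs with ASCII-only names."""
    return [f for f in filenames if is_ascii_name(f)]


if __name__ == "__main__":
    # The two-byte UTF-8 sequence shows up as two separate characters.
    print(len(name), len(seen_on_windows))
    print(skip_non_ascii(["simple.texi", name]))
```

A filter of this kind is roughly what a "make check NONASCII=no" switch would need to apply to the list of test inputs; how that would be wired into the Makefiles is left open here.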
