On Sat, Feb 1, 2014 at 3:50 PM, Eli Zaretskii <[email protected]> wrote:
>> This depends on what files are out there with no encoding specified.
>> Do you know how long makeinfo has output an encoding section? Is it
>> still possible today that makeinfo could output a UTF-8 file with no
>> encoding specified?
>
> It could be, with makeinfo 4.13, I think.
>
> In any case, UTF-8 covers ASCII, so I think it is safe these days.
>

What if a file is not in UTF-8 and doesn't specify its encoding? Is it
likely, for example, that there are many files in ISO-8859-1 which
don't specify their encoding?
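As a rough sketch of how that case could be handled (this is only what I
have in mind, not what the reader does now): default to UTF-8, and only
fall back to ISO-8859-1 when the bytes are not valid UTF-8. Since UTF-8
is a superset of ASCII, pure-ASCII output from older makeinfo versions
passes the check unchanged. The names below are made up for the example;
the check is only a rough lead-byte/continuation-byte test, not a full
validator.

#include <stddef.h>

/* Return 1 if BUF of LEN bytes looks like valid UTF-8, 0 otherwise. */
static int
looks_like_utf8 (const unsigned char *buf, size_t len)
{
  size_t i = 0;
  while (i < len)
    {
      unsigned char c = buf[i++];
      size_t extra;

      if (c < 0x80)                extra = 0;   /* ASCII */
      else if ((c & 0xE0) == 0xC0) extra = 1;   /* 2-byte sequence */
      else if ((c & 0xF0) == 0xE0) extra = 2;   /* 3-byte sequence */
      else if ((c & 0xF8) == 0xF0) extra = 3;   /* 4-byte sequence */
      else return 0;                            /* invalid lead byte */

      while (extra--)
        if (i >= len || (buf[i++] & 0xC0) != 0x80)
          return 0;                             /* truncated or bad
                                                   continuation byte */
    }
  return 1;
}

/* Pick a default encoding for a file that declares none. */
static const char *
guess_encoding (const unsigned char *buf, size_t len)
{
  return looks_like_utf8 (buf, len) ? "UTF-8" : "ISO-8859-1";
}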
>> We rely on iconv to find an exact conversion between characters.
>> Failing that we look for an ASCII replacement.
>
> Yes, I understand, but why do we need a separate function for each
> encoding? Isn't degrade_utf8 capable of covering all of them?
>

Good point: if a character is not representable in the target encoding,
first convert it to UTF-8, then run degrade_utf8. I was thinking of
doing it all in one step.

>> > One of the disadvantages of those degrade_* functions is that you
>> > must match each encoding with a function, and there are an awful
>> > lot of possible encodings out there.
>> >
>> I've listed all the ones listed in the Texinfo manual
>> (http://www.gnu.org/software/texinfo/manual/texinfo/html_node/_0040documentencoding.html#g_t_0040documentencoding).
>> Hopefully in the future more won't be added and everyone will use
>> UTF-8 instead.
>
> I don't think we can forbid people from using their encoding, even if
> it is not in that list.
>

I don't know what happens if someone puts something like
"@documentencoding GB2312" in their Texinfo source. It doesn't matter
anyway, because I'll use degrade_utf8 for all of them as above.
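To make the two-step idea concrete, here is a rough sketch, not the
actual code: when iconv cannot convert a character exactly into the
target encoding, convert just that character to UTF-8 and look it up in
a single UTF-8-keyed replacement table. The degrade_utf8 table and the
degrade_char helper below are stand-ins I made up for illustration, not
the real functions.

#include <iconv.h>
#include <string.h>

/* Hypothetical UTF-8 -> ASCII replacement table. */
static const char *
degrade_utf8 (const char *utf8)
{
  static const struct { const char *from, *to; } tab[] = {
    { "\xc3\xa9", "e" },     /* U+00E9 LATIN SMALL LETTER E WITH ACUTE */
    { "\xc2\xa9", "(C)" },   /* U+00A9 COPYRIGHT SIGN */
    { 0, 0 }
  };
  for (int i = 0; tab[i].from; i++)
    if (strcmp (utf8, tab[i].from) == 0)
      return tab[i].to;
  return "?";                /* last-resort replacement */
}

/* Degrade one character that could not be converted exactly: go from
   FROM_ENC to UTF-8 with iconv, then use the single degrade_utf8
   table, instead of one degrade_* function per source encoding. */
static const char *
degrade_char (const char *from_enc, char *in, size_t inlen)
{
  static char utf8[16];
  char *outp = utf8;
  size_t outleft = sizeof utf8 - 1;

  iconv_t cd = iconv_open ("UTF-8", from_enc);
  if (cd == (iconv_t) -1)
    return "?";
  iconv (cd, &in, &inlen, &outp, &outleft);
  iconv_close (cd);
  *outp = '\0';

  return degrade_utf8 (utf8);
}

That way only the exact-conversion step depends on the document's
declared encoding; the ASCII fallback lives in one place.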
