On Fri, Dec 26, 2014 at 9:52 PM, Eli Zaretskii <e...@gnu.org> wrote:
> It's a broken file. I have no idea how they produced it, but it
> wasn't by stock makeinfo 4.8 on Windows, because that version already
> did both count byte offsets in makeinfo disregarding the CR
> characters, and had the EOL conversion function in the Info reader. I
> just checked its code, which I still have on my disk.
>
I couldn't quickly find the code in C makeinfo for this - is it
something to do with file modes under Windows?

You are probably right that it wasn't produced by makeinfo under
Windows, but I did reproduce something similar when running makeinfo
4.13 under GNU/Linux with a Texinfo source file with CR-LF line
endings. See the attached input and output files. The whitespace in
the output Info file doesn't make a lot of sense, but the point is that
the preamble of the Info file does contain a line with a CR-LF ending,
yet the tag table doesn't take this into account - the node separator
is at byte 113 of the file exactly.

It's possible that this file was produced in a similar way. There may
be similar results if a file has mixed kinds of line endings (or if it
includes other files with different line endings).

We can't exactly say that the tag tables in files like these are
"incorrect". The same goes for files produced under Windows where the
CR bytes aren't counted. We're just left with the problem of properly
loading the files that are out there.

> Its tag table accounts for the CR characters, which is wrong. That's
> why the Info reader from 4.13 cannot read it correctly. And that's
> exactly what will happen with Info files created by makeinfo 5.2 when
> someone tries to read them with Info from 4.13.
>
> Moreover, the same problem will happen with the Emacs Info reader.
> Emacs removes the CR characters when it reads files into buffers (any
> files, not just Info files), so it must have the tag table with
> offsets that disregard the CRs.

If it turns out there are files out there where the 1000-byte slack in
looking for a node isn't enough, we could tweak it, maybe by increasing
the slack as we get later on in the file. Maybe something similar could
be done in Emacs Info.

If we could stop makeinfo producing files with CR bytes, it would stop
this problem for newly produced files.
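
To make the offset mismatch and the slack idea concrete, here is a
minimal Python sketch. It is not taken from the makeinfo or Info
reader sources. The function names, the choice of the closest match,
and the growth rule in growing_slack() are all assumptions made up for
illustration; only the 1000-byte figure and the ^_ separator character
come from the discussion above and the Info format.

```python
NODE_SEP = b"\x1f"  # the ^_ character that starts a node in an Info file


def offset_without_cr(data: bytes, pos: int) -> int:
    """Offset of byte `pos` as it would be counted if CR bytes were skipped."""
    return pos - data.count(b"\r", 0, pos)


def growing_slack(tag_offset: int, base: int = 1000) -> int:
    """Hypothetical rule: allow 1 extra byte of slack per 100 bytes of offset."""
    return base + tag_offset // 100


def find_separator(data: bytes, tag_offset: int, slack: int = 1000) -> int:
    """Return the separator position closest to tag_offset, or -1.

    Only positions within `slack` bytes of tag_offset are considered.
    """
    lo = max(0, tag_offset - slack)
    hi = min(len(data), tag_offset + slack + 1)
    best = -1
    pos = data.find(NODE_SEP, lo, hi)
    while pos != -1:
        if best == -1 or abs(pos - tag_offset) < abs(best - tag_offset):
            best = pos
        pos = data.find(NODE_SEP, pos + 1, hi)
    return best


if __name__ == "__main__":
    # One CR-LF line in the preamble shifts the real position by one byte.
    data = b"line one\r\nline two\n\x1fFile: a.info,  Node: Top\n"
    real = data.index(NODE_SEP)
    tag = offset_without_cr(data, real)  # what a CR-ignoring tag table says
    print(real, tag, find_separator(data, tag))
```

With a tag table that ignores CRs, the recorded offset falls short of
the real position by one byte per CR before it, so the error can only
grow towards the end of the file. That is why a slack that increases
with the offset might be worth trying.
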
cr-lf-endings-4.texi
Description: TeXInfo document
cr-lf-endings-4.info
Description: Binary data