On Fri, Mar 01, 2019 at 02:25:29AM -0700, [email protected] wrote:
> > Maybe a special marker character could be output that texindex treats
> > specially: e.g. the above would be output as
> >
> > \entry{aa^_a}{1}{aa}{a}
> > \entry{aa^_z}{3}{aa}{z}
> > \entry{aah}{5}{aah}
> >
> > where ^_ is a 0x1F byte.
>
> Yes, perfect, that's real easy to handle in awk.

Unfortunately, writing ASCII control characters to files is not
reliable: the results vary depending on the TeX installation and
options.  Likely "^^_" is written instead.  See
https://tex.stackexchange.com/questions/8729/write-non-printable-ascii-characters-to-a-file

The only reliable alternative seems to be to use a control sequence,
thus:

\entry{aa\subind a}{1}{aa}{a}
\entry{aa\subind z}{3}{aa}{z}

I assume awk can deal with that fine.

It could be a "control symbol" instead, e.g.:

\entry{aa\!a}{1}{aa}{a}
\entry{aa\!z}{3}{aa}{z}

But I don't think we should use a control symbol for this, as it could
clash with existing or future Texinfo commands that might occur in an
index entry.

Whatever output is chosen, it would be good to check how existing
versions of texindex deal with it: we should avoid completely breaking
them if at all possible.
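
To make the \subind variant concrete, here is a rough sketch of how a sort key could be split at the marker. It is in Python rather than awk, purely for illustration, and it is not how texindex parses its input: the regular expression, the function names and the assumption that the marker always appears as "\subind" followed by one space are all mine.

```python
import re

# Hypothetical pattern for the \entry lines shown above: it captures only the
# first two brace groups (sort key and page). Not texindex's actual parser.
ENTRY_RE = re.compile(r"\\entry\{(?P<key>[^{}]*)\}\{(?P<page>[^{}]*)\}")

# Assumed separator: the control word plus the space TeX writes after it.
SUBIND = "\\subind "


def split_sort_key(line):
    """Return (primary, secondary_or_None, page) for one \\entry line."""
    m = ENTRY_RE.match(line)
    if m is None:
        raise ValueError(f"not an \\entry line: {line!r}")
    primary, sep, secondary = m.group("key").partition(SUBIND)
    return primary, (secondary if sep else None), m.group("page")


def sort_tuple(line):
    """Sort on the primary key first, then on the subentry (empty if none)."""
    primary, secondary, _page = split_sort_key(line)
    return (primary, secondary or "")


lines = [
    r"\entry{aah}{5}{aah}",
    r"\entry{aa\subind z}{3}{aa}{z}",
    r"\entry{aa\subind a}{1}{aa}{a}",
]
ordered = sorted(lines, key=sort_tuple)
```

Comparing the primary key separately from the subentry keeps both "aa" entries ahead of "aah", which is the point of having a marker at all.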
