On 15 January 2016 at 17:15, Masamichi HOSODA <[email protected]> wrote:
>> I think it could be done by changing the active definitions of bytes
>> 128-255 when writing to an auxiliary file to read a single Unicode
>> character and write out an ASCII sequence that represents that
>> character, probably involving the @U command. Do you know how to do
>> this?
>
> If I understand correctly, active definitions is unrelated.
It is related: the active definitions determine what \write puts in
the auxiliary file.
> On the other hand, in the case of "bytes" encoding,
>
> XeTeX reads as following:
> letter -> ".tex" -> inner XeTeX
> F -> 0x66 -> U+0066
> ü -> 0xC3, 0xBC -> U+00C3, U+00BC
> r -> 0x72 -> U+0072
>
> XeTeX writes ".toc" files in UTF-8 *always*.
> It cannot change without something like \XeTeXoutputencoding primitive:
> letter -> ".tex" -> inner XeTeX -> ".toc"
> F -> 0x66 -> U+0066 -> 0x66
> ü -> 0xC3, 0xBC -> U+00C3, U+00BC -> 0xC3, 0x83, 0xC2, 0xBC
> r -> 0x72 -> U+0072 -> 0x72
>
> As a result, ".tex" and ".toc" are different.
> Moreover, ".toc" is broken. It cannot be repaired.
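To make the byte arithmetic above concrete, here is a small Python
sketch. It is not XeTeX, only an imitation of the same round trip: it
assumes that "bytes" encoding means each input byte becomes the code
point with the same number (which is what a latin-1 decode does) and
that the output is always UTF-8. The variable names are mine.

```python
# Illustration only: imitate reading UTF-8 input one byte per character
# and then writing the internal characters back out as UTF-8.
source = "für"

tex_bytes = source.encode("utf-8")       # bytes stored in the .tex file
internal = tex_bytes.decode("latin-1")   # each byte -> one code point (U+00C3, U+00BC)
toc_bytes = internal.encode("utf-8")     # bytes that end up in the .toc file

print(tex_bytes.hex(" "))   # 66 c3 bc 72
print(toc_bytes.hex(" "))   # 66 c3 83 c2 bc 72
```

The second line of output shows the doubled encoding of the two bytes
of "ü", matching the table above.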
Your understanding here is correct but is beside the point. I'll
demonstrate with a small change:
Index: texinfo.tex
===================================================================
--- texinfo.tex (revision 6933)
+++ texinfo.tex (working copy)
@@ -4587,7 +4587,7 @@
% We want to disable all macros so that they are not expanded by \write.
\macrolist
%
- \normalturnoffactive
+ %\normalturnoffactive
%
% Handle some cases of @value -- where it does not contain any
% (non-fully-expandable) commands.
This change is in the \commondummies macro. With it applied, and with
the following input:
\input texinfo.tex
@documentencoding UTF-8
@contents
@chapter für
für
@bye
this line is written to the auxiliary table of contents (.toc) file:
@numchapentry{f@"ur}{1}{}{1}
Without the change, it would be
@numchapentry{für}{1}{}{1}
Do you understand now how changing the active definitions can change
what's written to the output files?
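
For anyone who wants to experiment with the idea outside TeX, here is
a rough Python sketch of "write an ASCII sequence that represents the
character". It is not how texinfo.tex does it (there the work is done
by the active character definitions); the function name, the ACCENTS
table and the choice to fall back to @U{...} are my own assumptions
for illustration.

```python
import unicodedata

# Hypothetical table: Unicode combining mark -> Texinfo accent command.
ACCENTS = {
    "\u0308": '@"',  # diaeresis (umlaut)
    "\u0301": "@'",  # acute
    "\u0300": "@`",  # grave
    "\u0302": "@^",  # circumflex
    "\u0303": "@~",  # tilde
}

def to_ascii_texinfo(text):
    """Return text with non-ASCII characters replaced by ASCII Texinfo."""
    out = []
    for ch in text:
        if ord(ch) < 128:
            out.append(ch)  # plain ASCII passes through unchanged
            continue
        decomposed = unicodedata.normalize("NFD", ch)
        if (len(decomposed) == 2 and ord(decomposed[0]) < 128
                and decomposed[1] in ACCENTS):
            # base letter plus one known accent, e.g. u + diaeresis -> @"u
            out.append(ACCENTS[decomposed[1]] + decomposed[0])
        else:
            # anything else: fall back to the code point form
            out.append("@U{%04X}" % ord(ch))
    return "".join(out)

print(to_ascii_texinfo("für"))
```

For the chapter title above this prints f@"ur, the same text that
appears in the @numchapentry line when the active definitions are
left in force.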