On Mon, Oct 11, 2021 at 11:35:06AM +0000, Alan Mackenzie wrote:
> If there are any other formatting characters above 0x7f inserted by
> Texinfo, I would also like their "ASCII" equivalents to be used instead.
I've checked with a test file and the output without @documentencoding
is close to what you ask for.
\input texinfo
@c @documentencoding UTF-8
@dfn{foo}
@code{code}
`bar'
`hello'
``oompa''
a---b
c--d
Herr M@"uller will Sie sprechen.
@bye
$ ./texi2any.pl test.texi -c OPEN_QUOTE_SYMBOL=\` -c CLOSE_QUOTE_SYMBOL=\'
test.texi: warning: document without nodes
$ cat test.info
This is test.info, produced by texi2any version 6.8dev+dev from
test.texi.
"foo"
`code'
'bar'
'hello'
"oompa"
a--b
c-d
Herr Müller will Sie sprechen.
Tag Table:
End Tag Table
Local Variables:
coding: utf-8
End:
$
Notice that OPEN_QUOTE_SYMBOL wasn't used in some of the cases.
With the @documentencoding line not commented out it is:
This is test.info, produced by texi2any version 6.8dev+dev from
test.texi.
“foo”
`code'
‘bar’
‘hello’
“oompa”
a—b
c–d
Herr Müller will Sie sprechen.
Tag Table:
End Tag Table
Local Variables:
coding: utf-8
End:
Again, OPEN_QUOTE_SYMBOL and CLOSE_QUOTE_SYMBOL do not affect the
output for ` and ', which is arguably a bug.
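To make the difference between the two outputs concrete, here is a
minimal Python sketch of the character mapping they imply. It is only an
illustration read off the test outputs above, not how texi2any does it
internally; the names ASCII_EQUIVALENTS and asciify_formatting are made
up for this example. It is also not meant as a filter for real Info
files, since replacing multi-byte characters with shorter ones would
shift the byte offsets recorded in the tag table.

```python
# Mapping read off the two test outputs above: the character produced
# with "@documentencoding UTF-8" on the left, the character(s) produced
# without it on the right. Names here are hypothetical, for illustration.
ASCII_EQUIVALENTS = {
    "\u201c": '"',   # left double quotation mark  (from `` or @dfn)
    "\u201d": '"',   # right double quotation mark (from '' or @dfn)
    "\u2018": "'",   # left single quotation mark  (from `)
    "\u2019": "'",   # right single quotation mark (from ')
    "\u2014": "--",  # em dash (from ---)
    "\u2013": "-",   # en dash (from --)
}

_TABLE = str.maketrans(ASCII_EQUIVALENTS)


def asciify_formatting(text: str) -> str:
    """Replace only the formatting characters listed above.

    Characters the author wrote on purpose, such as the u-umlaut from
    M@"uller, are left alone.
    """
    return text.translate(_TABLE)


if __name__ == "__main__":
    for line in ("\u201cfoo\u201d", "\u2018bar\u2019", "a\u2014b", "c\u2013d"):
        print(asciify_formatting(line))
```

The point of keeping the table this small is that only characters
inserted by the formatter are touched; accented letters in the running
text pass through unchanged.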
> > If you remove "@documentencoding UTF-8" from a file, the file is still
> > assumed to be in UTF-8, but less Unicode is used in the output where it
> > is not necessary. Does that help?
>
> Not really. I've got too many info files on my system (Gentoo
> GNU/Linux) to remove that directive from them all each time there's a
> new version of the file.texi.
>
> So, I'm asking you to implement such an option in the next version of
> Texinfo, or perhaps accept a patch from me which would do this.
Yes, I think it is a valid request to have such an option, especially as such
output is already available by changing the use of @documentencoding.
(That's why I made @documentencoding have this effect in the first place,
to give the chance to avoid having unnecessary UTF-8 sequences in Info files.)
Look at where the 'no_extra_unicode' flag is set in
Texinfo/Convert/Plaintext.pm - any option should use the same code as this.