index entries

Eli Zaretskii Fri, 19 Aug 2022 23:07:00 -0700

> Date: Fri, 19 Aug 2022 22:33:34 +0200
> From: Patrice Dumas <[email protected]>
> 
> In general I think that info should decode nodes to UTF-8 or something
> like that when they are in 8bit encodings and there is a need to search
> in them, and, when outputting on the terminal, should encode to the
> locale.  I attach an example with an index done in a latin1 encoded
> file, the index entry with an accented letter is not found, hinting that
> the latin1 encoded index entry is not decoded to UTF-8, and the index
> entry encoded in latin1 is not well output.  The command line should
> also be decoded to UTF-8.


The command line should be decoded using the locale's codeset, not
UTF-8, because that's how the shell works: it passes arguments in the
current locale's codeset.
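
To make the point about the command line concrete, here is a minimal
sketch in Python (not the stand-alone reader's actual code, which is
written in C).  The function names are hypothetical, and it assumes a
POSIX system where arguments arrive as bytes in the locale's codeset.

```python
import locale
import os
import sys


def locale_codeset() -> str:
    """Return the encoding Python derives from the current locale."""
    # False means: report the locale's encoding without calling setlocale().
    return locale.getpreferredencoding(False)


def raw_argv() -> list:
    """Recover the command-line arguments as the bytes the shell passed in."""
    # os.fsencode() undoes the decoding Python applied to sys.argv at startup.
    return [os.fsencode(arg) for arg in sys.argv[1:]]


def decode_argument(raw: bytes, codeset: str) -> str:
    """Decode one argument with the given codeset (hypothetical helper)."""
    # Undecodable bytes become U+FFFD instead of raising an exception.
    return raw.decode(codeset, errors="replace")
```

A caller would combine these as
`[decode_argument(a, locale_codeset()) for a in raw_argv()]`, so that an
argument typed in a Latin-1 locale and one typed in a UTF-8 locale both
end up as the same internal string.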

As for decoding the document, given that we have the @documentencoding
directive, which could specify any encoding whatsoever, the Info
reader should use the encoding specified for the document.  This is
already fixed for the Emacs reader, which uses the 'coding:' cookie at
the end of the Info file, so the simplest thing for the stand-alone
reader is to use the same.
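
As an illustration of using the 'coding:' cookie, the following Python
sketch looks for it in the "Local Variables:" trailer at the end of an
Info file and decodes the file accordingly.  The function names, the
size of the tail that is scanned, and the fallback encoding are all
assumptions made for the example; this is not how either reader is
implemented.

```python
import re

# A "coding: NAME" line, as found in an Emacs-style "Local Variables:" trailer.
_CODING_RE = re.compile(rb"^coding:\s*([A-Za-z0-9._-]+)\s*$", re.MULTILINE)


def find_coding_cookie(data: bytes, tail_size: int = 3000):
    """Return the encoding named by the trailing 'coding:' cookie, or None."""
    tail = data[-tail_size:]  # only the end of the file is inspected
    start = tail.rfind(b"Local Variables:")
    if start == -1:
        return None
    match = _CODING_RE.search(tail, start)
    return match.group(1).decode("ascii") if match else None


def decode_info(data: bytes, fallback: str = "utf-8") -> str:
    """Decode Info file contents using the cookie, else the fallback."""
    encoding = find_coding_cookie(data) or fallback
    try:
        return data.decode(encoding, errors="replace")
    except LookupError:
        # The cookie named an encoding Python does not know about.
        return data.decode(fallback, errors="replace")
```

Once the text is decoded this way, an index search compares decoded
strings, so an entry with an accented letter in a Latin-1 file can match
the same letter typed in a UTF-8 locale.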

When outputting to the terminal, the reader should indeed use the
locale's encoding.
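
The output side can be sketched the same way.  This Python fragment
encodes already-decoded text to a given codeset before writing it to a
binary stream; the function names are hypothetical, and replacing
unencodable characters with '?' is just one possible policy chosen for
the example.

```python
def encode_for_terminal(text: str, codeset: str) -> bytes:
    """Encode decoded text to the terminal's codeset (hypothetical helper)."""
    # Characters the codeset cannot represent are replaced with '?'.
    return text.encode(codeset, errors="replace")


def write_to_terminal(text: str, codeset: str, stream) -> None:
    """Write text to a binary stream, e.g. sys.stdout.buffer."""
    stream.write(encode_for_terminal(text, codeset))
    stream.flush()
```

With this, the same decoded index entry comes out as a single byte for
the accented letter on a Latin-1 terminal and as two bytes on a UTF-8
one.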

Re: info --apropos should decode/encode nodes/index entries
