On Fri, Feb 02, 2024 at 08:57:01AM +0200, Eli Zaretskii wrote:
> > From: Gavin Smith <gavinsmith0...@gmail.com>
> > Date: Thu, 1 Feb 2024 22:16:07 +0000
> > Cc: Patrice Dumas <pertu...@free.fr>, bug-texinfo@gnu.org
> >
> > On Thu, Feb 01, 2024 at 09:01:42AM +0200, Eli Zaretskii wrote:
> > > > Date: Wed, 31 Jan 2024 23:11:02 +0100
> > > > From: Patrice Dumas <pertu...@free.fr>
> > > >
> > > > That would not be difficult to implement as a customization variable.
> > > > What about COLLATION_LANGUAGE?
> > >
> > > What would be the possible values of this variable, and in what format
> > > will those values be specified?
> >
> > I imagine it would be a locale name for passing to newlocale and thence
> > to strxfrm_l. What Patrice implemented hardcodes the name "en_US.utf-8"
> > but this would be a possible value.
>
> I think en_US.utf-8 is (or at least can be by default) a combination
> of @documentlanguage and @documentencoding.
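
For illustration only, here is a minimal sketch of the idea discussed
above: a locale name is used to build collation keys, and the index
entries are sorted by those keys. It is not the texi2any code. It uses
Python's `locale` module as a stand-in for newlocale and strxfrm_l, and
unlike newlocale it changes the process-wide LC_COLLATE setting, so it
restores it afterwards. The function name `sort_index_entries` and the
parameter `collation_locale` are hypothetical, as is the fallback to
the "C" locale when the requested locale is not installed.

```python
import locale


def sort_index_entries(entries, collation_locale="en_US.UTF-8"):
    """Sort strings by the collation rules of the named locale.

    Hypothetical helper: collation_locale plays the role that a
    customization variable such as COLLATION_LANGUAGE might play.
    """
    # Calling setlocale() with no locale argument only queries the
    # current setting, so it can be restored later.
    saved = locale.setlocale(locale.LC_COLLATE)
    try:
        try:
            locale.setlocale(locale.LC_COLLATE, collation_locale)
        except locale.Error:
            # Assumption: fall back to the "C" locale if the requested
            # locale is not available on this system.
            locale.setlocale(locale.LC_COLLATE, "C")
        # locale.strxfrm() turns each string into a key that compares
        # according to the current LC_COLLATE rules.
        return sorted(entries, key=locale.strxfrm)
    finally:
        locale.setlocale(locale.LC_COLLATE, saved)
```

Calling `sort_index_entries(["b", "a", "B"], "C")` sorts by plain
character codes, while the ordering under "en_US.UTF-8" depends on the
locale data installed on the system.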

I try to keep the index collation as independent as possible of
@documentencoding and of the output encoding. Here the utf-8 part is
meant to give a sorting that is 'independent' of the encoding.

Regarding the language, for now the aim was to get something as similar
as possible to the Perl output, which is obtained without a locale. The
choice of en_US was motivated by that aim. I looked at the
/usr/lib/locale/*/LC_COLLATE files on my Debian GNU/Linux system and
there was no "en.utf-8", which would have been my first choice, so I
used "en_US.utf-8".

-- 
Pat