On Fri, Apr 24, 2026 at 12:53:15PM +0000, Vincent Belaïche wrote:
> Dear Gavin & al.,
> 
> I don't think that a solution that would be fit only for French, German
> and Swedish is a good idea.
> 
> Coming back to the patch I am proposing, you are arguing that it is not
> good because it relies on the installed locales. I think that this is
> not a problem of the proposed patch but of the awk-based texindex, which
> is limited to installed locales. I am expecting that if someone wants to
> compile a document for their own language, they have the corresponding
> locale installed. So the point with this patch --- as far as Emacs or
> any other SW compilation is concerned --- is rather that the compilation
> should not be broken because of a missing locale breaking the
> documentation compilation. This is something that the autogen
> configuration should check: look at the defined locales and restrict the
> documentation languages to what is installed. Or maybe the configuration
> step could by default restrict to manuals not needing any locale, and
> have an option to allow other manuals. Or alternatively, the
> configuration script should look at the texindex version, and if it is a
> version known to be dependent on installed locales, then restrict
> doc compilation to manuals not needing installed locales.
> 
> So, in a nutshell, the root cause is that texindex is dependent on
> installed locales for sorting, and I would like to react to your
> (Gavin's) comment about rewriting texindex. Basically, I disagree with
> you that it would be overcomplex (I am not volunteering however to go
> into this :-D ), because if texindex was to be rewritten one would not
> rewrite it from scratch but rather build on some existing indexing
> program, which means the amount of code to produce would not be that
> huge and complex. Biber would be a good candidate because it is a Perl
> program and texi2any is also Perl. I don't know how biber is coded, but
> I suspect it has something like a processing module and something like a
> cli module, and one could just import the processing module into a
> program like texi2any or a novel texindex Perl toplevel script in order
> to do the job. So using biber would also have the great advantage that
> both texi2any and texindex would use the same indexing engine under the
> hood. Or maybe the biber cli is general enough so that the novel
> texindex could be built on top of the cli (or texi2dvi directly call
> biber with suitable parameters).
> 
> By using biber, texinfo would just rely on default sorting rules made
> for each language, and texinfo maintainers would not need to develop
> anything every time a new language support is added, just inherit from
> the work done for LaTeX BibLaTeX/biber.

Sorry, I can't cope with texindex being rewritten in Perl right now.
Rewriting on top of some other program and completely
changing the format of the index files that texinfo.tex outputs is
just not manageable.  We already have too much to deal with in texi2any
in terms of internationalisation support.  Compared with texi2any,
texindex is a breath of fresh air.  I think I said already that the awk
implementation is simple, self-contained and portable.  I don't want it
to be dragged into the same quagmire of Perl/C/XS/gnulib/libunistring.


> Furthermore, biber doc (biber v2.21, §1 change 1.9) says:
> 
>       Biber no longer checks the environment for locales to use for
>       sorting.
> 
> which basically solves the current issue: biber has its own set of
> language-specific default sorting rules, with no dependence on installed
> locales.
> 
> So a texinfo/texindex built on top of Biber would
> 
> - adapt the texinfo.tex index output format to the biber input
>   format. That is to say, instead of outputting to ses-fr.cp this line
> 
>   @entry{ses-read-symbol}{4}{@code {ses-read-symbol}}
> 
>   it would output something in a .bib-like format as follows:
> 
>   @entry{key-18,
>     key={ses-read-symbol},
>     page={4},
>     value={@code {ses-read-symbol}}
>   }
> 
>   texinfo.tex would also need to output some .bcf control file to list
>   all the datasources, encodings and languages, and to state that all .bib
>   entries are cited, etc.

You've got no idea how difficult this would be to implement and how many
problems this would cause.

> - adapt biber processing modules in order to
>   - support the texinfo accent macro expansion. Basically this amounts
>     to telling biber that the cat 0 character is not \ but @, as the
>     macros are basically the same in LaTeX and Texinfo, for instance é
>     is {\'e} in LaTeX and @'{e} in Texinfo. OK, there are quite a few
>     differences, but those could be handled in the .bib production
>     process: for instance Texinfo ç is @,{c} and it could be translated by
>     texinfo.tex to LaTeX+catcode64=0 {@c c} during the .bib production
>     process.
> 
>   - support the output format that texinfo uses as entry for indexes, i.e.
>     instead of the \bibitem \newblock stuff of .bbl files, use some
>     @initial{...} @entry{...} output.
> 
> - make some main texindex Perl toplevel script, that would
>   - use the biber processing module,
>   - pass a configuration suitable for Texinfo to the biber processing module,
>   - parse CLI arguments to get the filename radix from texi2dvi.
> 
> 
>     Vincent.

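For illustration only, the index-entry conversion proposed in the first
bullet of the quoted list (an @entry line rewritten as a .bib-like record)
can be sketched as below. This is a rough sketch in Python, not texindex's
awk and not biber's Perl. The function names read_groups and entry_to_bib
are hypothetical, and the sketch assumes that each entry line has exactly
three brace groups and contains no escaped braces (@{ or @}), which real
index files may well violate.

```python
def read_groups(text, start):
    """Return the contents of consecutive top-level {...} groups in text,
    beginning at index start. Nested braces are kept inside a group."""
    groups = []
    i = start
    while i < len(text) and text[i] == "{":
        depth = 0
        j = i
        while j < len(text):
            if text[j] == "{":
                depth += 1
            elif text[j] == "}":
                depth -= 1
                if depth == 0:
                    break
            j += 1
        if depth != 0:
            raise ValueError("unbalanced braces in: " + text)
        groups.append(text[i + 1:j])
        i = j + 1
    return groups


def entry_to_bib(line, number):
    """Convert one '@entry{key}{page}{value}' line into the .bib-like
    record shown in the quoted proposal. 'number' only makes the record
    name unique (hypothetical naming scheme 'key-N')."""
    prefix = "@entry"
    line = line.strip()
    if not line.startswith(prefix):
        raise ValueError("not an @entry line: " + line)
    groups = read_groups(line, len(prefix))
    if len(groups) != 3:
        raise ValueError("expected 3 brace groups, got %d" % len(groups))
    key, page, value = groups
    return (
        f"@entry{{key-{number},\n"
        f"  key={{{key}}},\n"
        f"  page={{{page}}},\n"
        f"  value={{{value}}}\n"
        "}"
    )


if __name__ == "__main__":
    print(entry_to_bib("@entry{ses-read-symbol}{4}{@code {ses-read-symbol}}", 18))
```

Even this toy version has to track brace nesting to keep
@code {ses-read-symbol} intact, and it says nothing about the .bcf control
file or the accent macros.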