On Sun, Mar 16, 2025 at 01:47:13PM +0100, pertu...@free.fr wrote:
> On Wed, Mar 12, 2025 at 12:40:54AM +0100, pertu...@free.fr wrote:
> > On Tue, Mar 11, 2025 at 09:22:55PM +0000, Gavin Smith wrote:
> > > On Tue, Mar 11, 2025 at 10:02:25AM +0100, pertu...@free.fr wrote:
> > > > On Mon, Mar 10, 2025 at 07:28:54PM +0000, Gavin Smith wrote:
> > > > > Isn't that changing the purpose of --transliterate-file-names?
> > > > > 
> > > > > Before: "texi2any --transliterate-file-names" means to output files
> > > > > with transliterated file names.
> > > > > 
> > > > > After: "texi2any --transliterate-file-names" means to output files
> > > > > without transliterated files, and additionally to output redirection
> > > > > files with transliterated names.
> > > > 
> > > > Sorry, I was not clear, what I propose is to output transliterated
> > > > files, and additionally output redirection files with transliterated 
> > > > names.
> > > > (and also output redirection with non transliterated names).
> 
> I did the change I proposed, including changes in the documentation,
> please check that it is ok.
> 
> I remained somewhat vague on what happens for redirection files with
> --transliterate-file-names on purpose, but right now it should create
> redirection files both for transliterated and non transliterated targets
> when there is not already a corresponding file.

Link to archived discussion:
https://lists.gnu.org/archive/html/bug-texinfo/2025-03/msg00066.html

I kind of ran out of steam on this issue due to the many different
aspects being discussed and am trying to come back to it now.

I checked the current behaviour with the development version and I am
happy with the default behaviour.  When filenames are being transliterated
(i.e. with TRANSLITERATE_FILE_NAMES), it makes sense to me to output
redirection files (with non-transliterated names), so that other manuals
can link to it.  That takes care of the link stability issue.

Since I doubt that very many online manuals, if any, rely on transliterated
file names for working URLs, I'd prefer to avoid adding any kind of
configuration of the kinds of transliteration used in generated links,
whether it be with configuration files (e.g. htmlxref.cnf) or command-line
options.

The current (git) behaviour of using the non-transliterated file names
in external links seems the best to me.  (That means using the algorithm
described in the "HTML Xref Node Name Expansion" node of the Texinfo manual,
and not passing the node name through the Text::Unidecode module.)

When testing texi2any to find out what the current behaviour was (as
I couldn't really remember), it occurred to me that the effects of
TRANSLITERATE_FILE_NAMES and ADD_TRANSLITERATED_REDIRECTION_FILES are
very similar:

* TRANSLITERATE_FILE_NAMES: main output in files with transliterated
names; create redirection files with non-transliterated names

* ADD_TRANSLITERATED_REDIRECTION_FILES: main output in files with
non-transliterated names; create redirection files with transliterated names.

I know you (Patrice) said that ADD_TRANSLITERATED_REDIRECTION_FILES would
be a temporary measure, but we should question whether we even need it
in the first place.  It could be enough for manuals to use
TRANSLITERATE_FILE_NAMES if they want links to the manual with
transliterated names to work.  I am aware of an important difference in
only one case: when you have two node names transliterating to the same
string.  Then the redirection file for the resulting string will only
go to one of the nodes.  For example:

    \input texinfo

    @node Döngo

    one

    @node Dóngo

    two

    @bye

Running:

../../tta/perl/texi2any.pl --html test.texi -c ADD_TRANSLITERATED_REDIRECTION_FILES=1

This generates an output file "test_html/Dongo.html", which contains
a redirection:

    <meta http-equiv="Refresh" content="0; url=D_00f6ngo.html">

However, there is not just test_html/D_00f6ngo.html, but also
test_html/D_00f3ngo.html.  So links to one of these nodes would go to the
wrong place.
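
The collision can be reproduced outside texi2any with a small sketch.
This is not texi2any's code.  Both helper names are hypothetical:
strip_accents is only a rough stand-in for Text::Unidecode (it assumes
that removing combining marks is enough), and xref_name is a simplified
form of the "HTML Xref Node Name Expansion" rule that handles only ASCII
letters and digits, spaces, and characters up to U+FFFF.

```python
import unicodedata
from collections import defaultdict


def strip_accents(name):
    """Rough stand-in for Text::Unidecode (assumption): drop combining marks."""
    decomposed = unicodedata.normalize("NFD", name)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def xref_name(name):
    """Simplified non-transliterated name: other characters become _xxxx."""
    out = []
    for c in unicodedata.normalize("NFC", name):
        if c.isascii() and c.isalnum():
            out.append(c)
        elif c == " ":
            out.append("-")
        else:
            # Four lowercase hex digits; code points above U+FFFF are not handled.
            out.append("_%04x" % ord(c))
    return "".join(out)


def transliteration_collisions(node_names):
    """Map each transliterated name shared by several nodes to their xref names."""
    groups = defaultdict(list)
    for name in node_names:
        groups[strip_accents(name)].append(xref_name(name))
    return {k: v for k, v in groups.items() if len(v) > 1}


print(transliteration_collisions(["Döngo", "Dóngo"]))
```

For the two nodes in the example, this reports that "Dongo" is shared by
D_00f6ngo and D_00f3ngo, which is exactly the case where a single
redirection file cannot point to both.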

With the other variable, it would work properly:

../../tta/perl/texi2any.pl --html test.texi -c TRANSLITERATE_FILE_NAMES=1

Then test_html/Dongo.html has both nodes in it:

<body lang="">
<h4 class="node" id="D_00f6ngo"><span>Döngo<a class="copiable-link" href="#D_00f6ngo"> &para;</a></span></h4>

<p>one
</p>
<hr>
<h4 class="node" id="D_00f3ngo"><span>Dóngo<a class="copiable-link" href="#D_00f3ngo"> &para;</a></span></h4>

<p>two
</p>


</body>

The redirection files go to the appropriate "fragments" within Dongo.html,
i.e.

    <meta http-equiv="Refresh" content="0; url=Dongo.html#D_00f3ngo">

and

    <meta http-equiv="Refresh" content="0; url=Dongo.html#D_00f6ngo">

so all links should work.

Hence, TRANSLITERATE_FILE_NAMES seems to be more reliable than
ADD_TRANSLITERATED_REDIRECTION_FILES.

I question who would actually be setting ADD_TRANSLITERATED_REDIRECTION_FILES
for the benefit of links to their manual.

I think it is confusing to have two variables which kind of do the same
thing, but not exactly, with similar names, for a feature which is hardly
ever used.  It will be hard for users to remember the difference.  In fact,
it will be hard for us, the developers, to remember the difference.

The implementation of the two variables seems intertwined in texi2any.

~

Also, I found a bug.  There doesn't seem to be validity checking for
the values of customisation variables given on the command line:

../../tta/perl/texi2any.pl --html test.texi -c ADD_TRANSLITERATED_REDIRECTION_FILES=
BUG: ADD_TRANSLITERATED_REDIRECTION_FILES: not an integer: 
ERROR: ADD_TRANSLITERATED_REDIRECTION_FILES unexpected conf error
BUG: ADD_TRANSLITERATED_REDIRECTION_FILES: not an integer: 
BUG: ADD_TRANSLITERATED_REDIRECTION_FILES: not an integer: 
ERROR: ADD_TRANSLITERATED_REDIRECTION_FILES unexpected conf error
BUG: ADD_TRANSLITERATED_REDIRECTION_FILES: not an integer: 
ERROR: ADD_TRANSLITERATED_REDIRECTION_FILES unexpected conf error
test.texi: warning: must specify a title with a title command or @top
test.texi:11: warning: no HTML cross-references entry found for `ext'

I guess that if an invalid value is given on the command line, it should
be detected and not used as the value, rather than causing bugs and errors
to be reported throughout the program.
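
One possible shape for such a check, as a sketch only: texi2any is written
in Perl, and the function name, the set of integer-valued variables and
the warning texts below are all hypothetical.  The idea is to validate
each NAME=VALUE argument once, when it is read, and to drop a
non-integer value for an integer-valued variable before it is stored.
Treating an argument without "=" as malformed is a choice of this sketch,
not a statement about what texi2any accepts.

```python
import re
import sys

# Hypothetical list of integer-valued variables, for illustration only.
INTEGER_VARIABLES = {"ADD_TRANSLITERATED_REDIRECTION_FILES"}


def parse_conf_argument(arg, warn=lambda msg: print(msg, file=sys.stderr)):
    """Return (name, value) for a valid NAME=VALUE argument, else None."""
    name, sep, value = arg.partition("=")
    if not sep or not name:
        warn("malformed customization argument: %r" % arg)
        return None
    if name in INTEGER_VARIABLES:
        # Reject empty or non-numeric values once, here, instead of later.
        if not re.fullmatch(r"-?[0-9]+", value):
            warn("%s: not an integer: %r (ignored)" % (name, value))
            return None
        return name, int(value)
    return name, value
```

With this approach the empty value from the command above would produce
a single warning, and the variable would keep its default.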

