We previously discussed the interface of the XS modules and trying to
improve their abstraction so that the same interface was used for both
XS and non-XS code.  (Emails from 2024-02-23 "Better abstraction of
Texinfo::Document interface regardless of XS".)  We made changes in
this direction.

There is still code like this in multiple places in the code, e.g. in
texi2any.pl:

# XS parser and not explicitely unset
my $XS_structuring = ((not defined($ENV{TEXINFO_XS})
                        or $ENV{TEXINFO_XS} ne 'omit')
                       and (not defined($ENV{TEXINFO_XS_PARSER})
                            or $ENV{TEXINFO_XS_PARSER} eq '1')
                       and (not defined($ENV{TEXINFO_XS_STRUCTURE})
                            or $ENV{TEXINFO_XS_STRUCTURE} ne '0'));

I think this makes the code harder to understand and is probably not
necessary.  It will make the code harder to maintain in the future.
(E.g., it could be something that makes new contributors give up.)

I tried to eliminate this, but haven't succeeded so far in the time I've
spent on it.

I found the $XS_structuring variable was being passed to
Texinfo::Parser::parse_texi_file, and the difference that this made was
whether Texinfo::Document::build_document or Texinfo::Document::get_document
was called (in 'parse_texi_file', in Texinfo/XS/parsetexi/Parsetexi.pm).
If $XS_structuring was false, then it was 'build_document' that was called,
which builds more of the document data in Perl structures.

I first tried fixing the value of this variable so that get_document would
always be called:


diff --git a/tp/t/test_utils.pl b/tp/t/test_utils.pl
index de26a822e0..78238537df 100644
--- a/tp/t/test_utils.pl
+++ b/tp/t/test_utils.pl
@@ -119,6 +119,7 @@ my $XS_structuring = ((not defined($ENV{TEXINFO_XS})
                             or $ENV{TEXINFO_XS_PARSER} eq '1')
                        and (not defined($ENV{TEXINFO_XS_STRUCTURE})
                             or $ENV{TEXINFO_XS_STRUCTURE} ne '0'));
+$XS_structuring = 1;
 
 my $generated_texis_dir = 't_texis';
 
diff --git a/tp/texi2any.pl b/tp/texi2any.pl
index 7f6327f504..d4154ca91b 100755
--- a/tp/texi2any.pl
+++ b/tp/texi2any.pl
@@ -1430,6 +1430,8 @@ my $XS_structuring = ((not defined($ENV{TEXINFO_XS})
                        and (not defined($ENV{TEXINFO_XS_STRUCTURE})
                             or $ENV{TEXINFO_XS_STRUCTURE} ne '0'));
 
+$XS_structuring = 1;
+
 if (defined($ENV{TEXINFO_XS_EXTERNAL_CONVERSION})
     and $ENV{TEXINFO_XS_EXTERNAL_CONVERSION}) {
   set_from_cmdline('XS_EXTERNAL_CONVERSION', 1);


I thought that the extra Perl structures were needed by the pure Perl
module Texinfo/Structuring.pm.  Access to these structures is through
accessor functions in Texinfo/Document.pm; (e.g. 'nodes_list') however,
the Perl structures are not built on demand in the pure Perl accessor
functions.  To try to compensate for this, I enabled the XS overrides in
Texinfo/Document.pm:

diff --git a/tp/Texinfo/Document.pm b/tp/Texinfo/Document.pm
index 4b8ceb1c99..76479499b1 100644
--- a/tp/Texinfo/Document.pm
+++ b/tp/Texinfo/Document.pm
@@ -95,7 +95,7 @@ sub import {
       for my $sub (keys %XS_overrides) {
         Texinfo::XSLoader::override ($sub, $XS_overrides{$sub});
       }
-      if ($XS_structuring) {
+      if ($XS_parser or $XS_structuring) {
         for my $sub (keys %XS_structure_overrides) {
           Texinfo::XSLoader::override ($sub, $XS_structure_overrides{$sub});
         }

(Another reason that this seems like a good change to make is that there
seems to be no reason that Texinfo/Document.pm should have a different
implementation depending on the value of TEXINFO_XS_STRUCTURE.)

This does appear to make texi2any work with TEXINFO_XS_STRUCTURE=0, although
there are various test failures that I haven't investigated properly
It appears that the failures are to do with @item; e.g. for
"converters_tests.t indices_in_begin_tables_lists_entries_after_item", the
diff includes

@@ -3848,15 +3846,13 @@ 
$result_texis{'indices_in_begin_tables_lists_entries_after_item'} = '\\input tex
 
 @itemize @minus
 @c comment in itemize
-@item 
 @cindex also a cindex in itemize
-e--mph item
+@item e--mph item
 @end itemize
 
 @itemize @bullet
-@item 
 @cindex index entry within itemize
-i--tem 1
+@item i--tem 1
 @item @cindex index entry right after @@item
 i--tem 2
 @end itemize

- which appears to be the result of some tree transformation not being
reflected.

I guess the main issue is with having a tree in both Perl and C and the
Perl transformation not being reflected in the C tree structure which
is then used for the output.

So I think this approach is close to working if I can work out what
is going on with the right version of the tree being used.  Hopefully,
then much of the special-casing for TEXINFO_XS_STRUCTURE throughout the
code for texi2any would no longer be needed.  Do you think this is right,
or have I missed something?

Another option is to get rid of TEXINFO_XS_STRUCTURE completely.  As you
may remember, we introduced this option after the last Texinfo release,
when the new XS code had introduced a degree of slow-down and we were
looking for ways to disable the new XS code while this could be fixed.
Since this slow-down was in fact later fixed, it is not as necessary.  I
expect that I would find it easy to remove TEXINFO_XS_STRUCTURE, while I
am still not sure how easily I could get the approach above to work.

Do you have any idea as to the best way to proceed?


Reply via email to