We previously discussed the interface of the XS modules and trying to improve their abstraction so that the same interface was used for both XS and non-XS code. (Emails from 2024-02-23 "Better abstraction of Texinfo::Document interface regardless of XS".) We made changes in this direction.
There is still code like this in multiple places in the code, e.g. in texi2any.pl: # XS parser and not explicitely unset my $XS_structuring = ((not defined($ENV{TEXINFO_XS}) or $ENV{TEXINFO_XS} ne 'omit') and (not defined($ENV{TEXINFO_XS_PARSER}) or $ENV{TEXINFO_XS_PARSER} eq '1') and (not defined($ENV{TEXINFO_XS_STRUCTURE}) or $ENV{TEXINFO_XS_STRUCTURE} ne '0')); I think this makes the code harder to understand and is probably not necessary. It will make the code harder to maintain in the future. (E.g., it could be something that makes new contributors give up.) I tried to eliminate this, but haven't succeeded so far in the time I've spent on it. I found the $XS_structuring variable was being passed to Texinfo::Parser::parse_texi_file, and the difference that this made was whether Texinfo::Document::build_document or Texinfo::Document::get_document was called (in 'parse_texi_file', in Texinfo/XS/parsetexi/Parsetexi.pm). If $XS_structuring was false, then it was 'build_document' that was called, which builds more of the document data in Perl structures. I first tried fixing the value of this variable so that get_document would always be called: diff --git a/tp/t/test_utils.pl b/tp/t/test_utils.pl index de26a822e0..78238537df 100644 --- a/tp/t/test_utils.pl +++ b/tp/t/test_utils.pl @@ -119,6 +119,7 @@ my $XS_structuring = ((not defined($ENV{TEXINFO_XS}) or $ENV{TEXINFO_XS_PARSER} eq '1') and (not defined($ENV{TEXINFO_XS_STRUCTURE}) or $ENV{TEXINFO_XS_STRUCTURE} ne '0')); +$XS_structuring = 1; my $generated_texis_dir = 't_texis'; diff --git a/tp/texi2any.pl b/tp/texi2any.pl index 7f6327f504..d4154ca91b 100755 --- a/tp/texi2any.pl +++ b/tp/texi2any.pl @@ -1430,6 +1430,8 @@ my $XS_structuring = ((not defined($ENV{TEXINFO_XS}) and (not defined($ENV{TEXINFO_XS_STRUCTURE}) or $ENV{TEXINFO_XS_STRUCTURE} ne '0')); +$XS_structuring = 1; + if (defined($ENV{TEXINFO_XS_EXTERNAL_CONVERSION}) and $ENV{TEXINFO_XS_EXTERNAL_CONVERSION}) { set_from_cmdline('XS_EXTERNAL_CONVERSION', 1); I thought that the extra Perl structures were needed by the pure Perl module Texinfo/Structuring.pm. Access to these structures is through accessor functions in Texinfo/Document.pm; (e.g. 'nodes_list') however, the Perl structures are not built on demand in the pure Perl accessor functions. To try to compensate for this, I enabled the XS overrides in Texinfo/Document.pm: diff --git a/tp/Texinfo/Document.pm b/tp/Texinfo/Document.pm index 4b8ceb1c99..76479499b1 100644 --- a/tp/Texinfo/Document.pm +++ b/tp/Texinfo/Document.pm @@ -95,7 +95,7 @@ sub import { for my $sub (keys %XS_overrides) { Texinfo::XSLoader::override ($sub, $XS_overrides{$sub}); } - if ($XS_structuring) { + if ($XS_parser or $XS_structuring) { for my $sub (keys %XS_structure_overrides) { Texinfo::XSLoader::override ($sub, $XS_structure_overrides{$sub}); } (Another reason that this seems like a good change to make is that there seems to be no reason that Texinfo/Document.pm should have a different implementation depending on the value of TEXINFO_XS_STRUCTURE.) This does appear to make texi2any work with TEXINFO_XS_STRUCTURE=0, although there are various test failures that I haven't investigated properly It appears that the failures are to do with @item; e.g. for "converters_tests.t indices_in_begin_tables_lists_entries_after_item", the diff includes @@ -3848,15 +3846,13 @@ $result_texis{'indices_in_begin_tables_lists_entries_after_item'} = '\\input tex @itemize @minus @c comment in itemize -@item @cindex also a cindex in itemize -e--mph item +@item e--mph item @end itemize @itemize @bullet -@item @cindex index entry within itemize -i--tem 1 +@item i--tem 1 @item @cindex index entry right after @@item i--tem 2 @end itemize - which appears to be the result of some tree transformation not being reflected. I guess the main issue is with having a tree in both Perl and C and the Perl transformation not being reflected in the C tree structure which is then used for the output. So I think this approach is close to working if I can work out what is going on with the right version of the tree being used. Hopefully, then much of the special-casing for TEXINFO_XS_STRUCTURE throughout the code for texi2any would no longer be needed. Do you think this is right, or have I missed something? Another option is to get rid of TEXINFO_XS_STRUCTURE completely. As you may remember, we introduced this option after the last Texinfo release, when the new XS code had introduced a degree of slow-down and we were looking for ways to disable the new XS code while this could be fixed. Since this slow-down was in fact later fixed, it is not as necessary. I expect that I would find it easy to remove TEXINFO_XS_STRUCTURE, while I am still not sure how easily I could get the approach above to work. Do you have any idea as to the best way to proceed?