Below is a patch that introduces a new customization variable,
USE_UNICODE_COLLATE, to avoid using the Unicode::Collate module.  Turning
the module off shortens run times by roughly 3% to 8% in the timings below
(depending on the size of the indices in the document).

Users can use this if they don't find texi2any fast enough, if they
don't care about having the indices sorted correctly, if they don't
have many non-ASCII characters in index entry text, or just for working
on a manual.

I propose that USE_UNICODE_COLLATE be on by default, as is currently
the case, so that indices are sorted correctly by default; the
performance impact is relatively small.
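
To illustrate what is traded away when the variable is off, here is a
minimal stand-alone sketch.  It is not code from texi2any: sort_entries
is a hypothetical helper, and the plain "cmp" fallback only stands in for
Texinfo::CollateStub, whose actual behaviour may differ.

```perl
use strict;
use warnings;
use utf8;

# Hypothetical helper (not part of texi2any): sort with Unicode::Collate
# when requested and loadable, otherwise by plain code point comparison.
sub sort_entries {
  my ($use_unicode_collate, @entries) = @_;
  if ($use_unicode_collate and eval { require Unicode::Collate; 1 }) {
    return Unicode::Collate->new()->sort(@entries);
  }
  # Assumed stand-in for the stub collator: compare by code point.
  return sort { $a cmp $b } @entries;
}

my @entries = ('zebra', 'Émile', 'apple', 'eclair');
binmode(STDOUT, ':encoding(UTF-8)');
print join(' ', sort_entries(1, @entries)), "\n";
print join(' ', sort_entries(0, @entries)), "\n";
```

With Unicode::Collate available, the first line should place "Émile"
next to "eclair"; the second line puts it after "zebra", since É has a
higher code point than any ASCII letter.  That is the kind of difference
users with few non-ASCII index entries would not notice.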

Patrice, is this the correct way to add a customization variable?  (I am
not familiar with the new Texinfo/options_data.txt file.)

Sample timings:

$ time ../tp/texi2any.pl ../../libc/libc.texinfo -c USE_UNICODE_COLLATE=1
creature.texi:309: warning: `.' or `,' must follow @xref, not f

real    0m6.009s
user    0m5.586s
sys     0m0.421s


$ time ../tp/texi2any.pl ../../libc/libc.texinfo -c USE_UNICODE_COLLATE=0
creature.texi:309: warning: `.' or `,' must follow @xref, not f

real    0m5.821s
user    0m5.460s
sys     0m0.360s


$ time ../tp/texi2any.pl ../../emacs-lispref-27.2/elisp.texi -c USE_UNICODE_COLLATE=1
functions.texi:2390: warning: @inforef is obsolete
errors.texi:226: warning: unexpected argument on @ignore line: The following seem to be unused now.

real    0m5.383s
user    0m5.146s
sys     0m0.237s

$ time ../tp/texi2any.pl ../../emacs-lispref-27.2/elisp.texi -c USE_UNICODE_COLLATE=0
functions.texi:2390: warning: @inforef is obsolete
errors.texi:226: warning: unexpected argument on @ignore line: The following seem to be unused now.

real    0m4.960s
user    0m4.739s
sys     0m0.221s


diff --git a/tp/Texinfo/Indices.pm b/tp/Texinfo/Indices.pm
index a9c31b2d24..c1032bb199 100644
--- a/tp/Texinfo/Indices.pm
+++ b/tp/Texinfo/Indices.pm
@@ -392,7 +392,8 @@ sub setup_sortable_index_entries($$$$$)
   my $collator;
   eval { require Unicode::Collate; Unicode::Collate->import; };
   my $unicode_collate_loading_error = $@;
-  if ($unicode_collate_loading_error eq '') {
+  if ($unicode_collate_loading_error eq ''
+        and $customization_information->get_conf('USE_UNICODE_COLLATE')) {
     $collator = Unicode::Collate->new(%collate_options);
   } else {
     $collator = Texinfo::CollateStub->new();
diff --git a/tp/Texinfo/options_data.txt b/tp/Texinfo/options_data.txt
index e972d75f6b..d0788b1eed 100644
--- a/tp/Texinfo/options_data.txt
+++ b/tp/Texinfo/options_data.txt
@@ -185,6 +185,7 @@ TEXTCONTENT_COMMENT                converter_customization undef   integer
 # be good to update from time to time to avoid test results that are not
 # valid against their reported DTD.
 TEXINFO_DTD_VERSION                converter_customization 7.1     char
+USE_UNICODE_COLLATE                converter_customization 1       integer
 
 # Some are for all converters, EXTENSION for instance, some for
 # some converters, for example CLOSE_QUOTE_SYMBOL and many

