Hi, I believe I've found the issue with makeinfo on Perl 5.28.

Aside from the trivial unescaped left brace issues in regexps, there's
a change in locale handling that can indeed cause a busy loop for some
inputs under non-UTF-8 locales.

I've been able to reduce the necessary input to just two lines:

  @documentencoding UTF-8
  @indicateurl{foo}

The busy loop happens in xspara__add_next()
(Texinfo::Convert::XSParagraph) when mbrtowc(3) returns -1, indicating
an invalid multibyte sequence even though it is valid UTF-8.

Perl 5.28 introduced thread-safe locales, where setlocale() only affects
the locale of the current thread. Apparently external code like mbrtowc(3)
isn't aware of this thread-specific locale without special handling.

I'm attaching a patch that fixes this for me, and another one for the
necessary left brace escaping. Even with this, I suppose xspara.c could
use some mbrtowc(3) return value checks.
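
As an illustration of the kind of return value check meant here, below is a
minimal sketch. It is not taken from xspara.c: count_chars_checked() is a
hypothetical helper, and treating an undecodable byte as one character is
only an assumption about what a sensible fallback might be. The point is
that the loop always advances by at least one byte, so a (size_t) -1 or
(size_t) -2 result from mbrtowc(3) cannot cause a busy loop.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper, not part of xspara.c: count the characters in the
   first LEN bytes of S, guaranteeing progress on every iteration. */
size_t
count_chars_checked (const char *s, size_t len)
{
  mbstate_t state;
  size_t count = 0;

  assert (s != NULL);
  memset (&state, 0, sizeof state);

  while (len > 0)
    {
      wchar_t wc;
      size_t n = mbrtowc (&wc, s, len, &state);

      if (n == (size_t) -1 || n == (size_t) -2)
        {
          /* Invalid or incomplete sequence in the current locale:
             reset the conversion state and consume a single byte
             (assumed fallback) instead of retrying the same input. */
          memset (&state, 0, sizeof state);
          n = 1;
        }
      else if (n == 0)
        {
          /* A null byte was converted; step over it explicitly. */
          n = 1;
        }

      s += n;
      len -= n;
      count++;
    }

  return count;
}
```

With a fallback like this, a locale mismatch would still give wrong
character counts for UTF-8 input, but the conversion would terminate.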

Some pointers:

 https://metacpan.org/pod/distribution/perl/pod/perldelta.pod#Locales-are-now-thread-safe-on-systems-that-support-them
 https://metacpan.org/pod/distribution/perl/dist/ExtUtils-ParseXS/lib/perlxs.pod#CAVEATS
 https://perl5.git.perl.org/perl.git/commitdiff/e9bc6d6b34a
 https://perl5.git.perl.org/perl.git/commitdiff/58e641fba50

Hope this helps,
-- 
Niko Tyni   nt...@debian.org
From 9031aefb7f180f718db83aec5e2782079455a32f Mon Sep 17 00:00:00 2001
From: Niko Tyni <nt...@debian.org>
Date: Sat, 30 Jun 2018 16:51:13 +0100
Subject: [PATCH] Update locale handling for Perl 5.28

Perl 5.28 introduced thread-safe locales, where setlocale()
only affects the locale of the current thread. External code
like mbrtowc(3) isn't aware of this thread specific locale,
so we need to explicitly modify the global one instead.

Without this we could enter a busy loop in xspara__add_next()
(Texinfo::Convert::XSParagraph) for UTF-8 documents when mbrtowc(3)
returned -1.
---
 tp/Texinfo/Convert/XSParagraph/xspara.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/tp/Texinfo/Convert/XSParagraph/xspara.c b/tp/Texinfo/Convert/XSParagraph/xspara.c
index 51eea4a..f2d6d1c 100644
--- a/tp/Texinfo/Convert/XSParagraph/xspara.c
+++ b/tp/Texinfo/Convert/XSParagraph/xspara.c
@@ -248,6 +248,11 @@ xspara_init (void)
 
   dTHX;
 
+#if PERL_VERSION > 27 || (PERL_VERSION == 27 && PERL_SUBVERSION > 8)
+  /* needed due to thread-safe locale handling in newer perls */
+  switch_to_global_locale();
+#endif
+
   if (setlocale (LC_CTYPE, "en_US.UTF-8")
       || setlocale (LC_CTYPE, "en_US.utf8"))
     goto success;
@@ -320,6 +325,10 @@ failure:
     {
 success: ;
       free (utf8_locale);
+#if PERL_VERSION > 27 || (PERL_VERSION == 27 && PERL_SUBVERSION > 8)
+      /* needed due to thread-safe locale handling in newer perls */
+      sync_locale();
+#endif
       /*
       fprintf (stderr, "tried to set LC_CTYPE to UTF-8.\n");
       fprintf (stderr, "character encoding is: %s\n",
-- 
2.17.0

From 1f27900352e04ff4f19bec1c1e9635adad2be31c Mon Sep 17 00:00:00 2001
From: Niko Tyni <nt...@debian.org>
Date: Fri, 18 May 2018 10:40:00 +0100
Subject: [PATCH] Fix unescaped left braces in regexps, deprecated since Perl
 5.27.8

This fixes test failures on recent Perl versions.
---
 tp/Texinfo/Parser.pm | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tp/Texinfo/Parser.pm b/tp/Texinfo/Parser.pm
index dc32ca2..c577aa9 100644
--- a/tp/Texinfo/Parser.pm
+++ b/tp/Texinfo/Parser.pm
@@ -5478,11 +5478,11 @@ sub _parse_special_misc_command($$$$)
     }
   } elsif ($command eq 'clickstyle') {
     # REMACRO
-    if ($line =~ /^\s+@([[:alnum:]][[:alnum:]\-]*)({})?\s*/) {
+    if ($line =~ /^\s+@([[:alnum:]][[:alnum:]\-]*)(\{\})?\s*/) {
       $args = ['@'.$1];
       $self->{'clickstyle'} = $1;
       $remaining = $line;
-      $remaining =~ s/^\s+@([[:alnum:]][[:alnum:]\-]*)({})?\s*(\@(c|comment)((\@|\s+).*)?)?//;
+      $remaining =~ s/^\s+@([[:alnum:]][[:alnum:]\-]*)(\{\})?\s*(\@(c|comment)((\@|\s+).*)?)?//;
       $has_comment = 1 if (defined($4));
     } else {
       $self->line_error (sprintf($self->__(
-- 
2.17.0
