Hi, I believe I've found the issue with makeinfo on Perl 5.28. Aside from the trivial unescaped left brace things, there's a change in locale handling that can indeed cause a busy loop for some inputs under non-UTF8 locales.
I've been able to reduce the necessary input to just two lines:

 @documentencoding UTF-8
 @indicateurl{foo}

The busy loop happens in xspara__add_next() (Texinfo::Convert::XSParagraph)
when mbrtowc(3) returns -1, indicating an invalid multibyte sequence even
though it is valid UTF-8.

Perl 5.28 introduced thread-safe locales, where setlocale() only affects
the locale of the current thread. Apparently external code like mbrtowc(3)
isn't aware of this thread specific locale without special handling.

I'm attaching a patch that fixes this for me, and another one for the
necessary left brace escaping. Even with this, I suppose xspara.c could
use some mbrtowc(3) return value checks.

Some pointers:

 https://metacpan.org/pod/distribution/perl/pod/perldelta.pod#Locales-are-now-thread-safe-on-systems-that-support-them
 https://metacpan.org/pod/distribution/perl/dist/ExtUtils-ParseXS/lib/perlxs.pod#CAVEATS
 https://perl5.git.perl.org/perl.git/commitdiff/e9bc6d6b34a
 https://perl5.git.perl.org/perl.git/commitdiff/58e641fba50

Hope this helps,
-- 
Niko Tyni   nt...@debian.org
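A minimal sketch of the kind of mbrtowc(3) return value check mentioned
above. This is not the xspara.c code: count_chars_checked() is a
hypothetical helper, and skipping one byte on error is only one possible
recovery policy. The point is that the loop always advances, so a -1 or
-2 return cannot turn into a busy loop.

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper, not part of xspara.c.  Walks LEN bytes of S in
   the current locale and returns the number of characters decoded.
   Bytes that mbrtowc() rejects are skipped one at a time and counted
   in *INVALID (if INVALID is not NULL). */
static size_t
count_chars_checked (const char *s, size_t len, size_t *invalid)
{
  mbstate_t state;
  size_t chars = 0;
  size_t bad = 0;

  memset (&state, 0, sizeof state);

  while (len > 0)
    {
      wchar_t wc;
      size_t n = mbrtowc (&wc, s, len, &state);

      if (n == (size_t) -1 || n == (size_t) -2)
        {
          /* Invalid (-1) or incomplete (-2) sequence.  Reset the
             conversion state and skip a single byte so that the loop
             is guaranteed to make progress. */
          memset (&state, 0, sizeof state);
          bad++;
          s++;
          len--;
          continue;
        }

      if (n == 0)
        n = 1;  /* a null character was decoded; step over its byte */

      chars++;
      s += n;
      len -= n;
    }

  if (invalid)
    *invalid = bad;
  return chars;
}
```

For plain ASCII input such as "foo" the helper decodes three characters
and reports no invalid bytes, whatever the current locale is.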
From 9031aefb7f180f718db83aec5e2782079455a32f Mon Sep 17 00:00:00 2001
From: Niko Tyni <nt...@debian.org>
Date: Sat, 30 Jun 2018 16:51:13 +0100
Subject: [PATCH] Update locale handling for Perl 5.28

Perl 5.28 introduced thread-safe locales, where setlocale()
only affects the locale of the current thread. External code like
mbrtowc(3) isn't aware of this thread specific locale, so we need to
explicitly modify the global one instead.

Without this we could enter a busy loop in xspara__add_next()
(Texinfo::Convert::XSParagraph) for UTF-8 documents when mbrtowc(3)
returned -1.
---
 tp/Texinfo/Convert/XSParagraph/xspara.c | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/tp/Texinfo/Convert/XSParagraph/xspara.c b/tp/Texinfo/Convert/XSParagraph/xspara.c
index 51eea4a..f2d6d1c 100644
--- a/tp/Texinfo/Convert/XSParagraph/xspara.c
+++ b/tp/Texinfo/Convert/XSParagraph/xspara.c
@@ -248,6 +248,11 @@ xspara_init (void)
 
   dTHX;
 
+#if PERL_VERSION > 27 || (PERL_VERSION == 27 && PERL_SUBVERSION > 8)
+  /* needed due to thread-safe locale handling in newer perls */
+  switch_to_global_locale();
+#endif
+
   if (setlocale (LC_CTYPE, "en_US.UTF-8")
       || setlocale (LC_CTYPE, "en_US.utf8"))
     goto success;
@@ -320,6 +325,10 @@ failure:
   {
 success: ;
   free (utf8_locale);
+#if PERL_VERSION > 27 || (PERL_VERSION == 27 && PERL_SUBVERSION > 8)
+  /* needed due to thread-safe locale handling in newer perls */
+  sync_locale();
+#endif
   /*
   fprintf (stderr, "tried to set LC_CTYPE to UTF-8.\n");
   fprintf (stderr, "character encoding is: %s\n",
-- 
2.17.0
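For readers unfamiliar with per-thread locales, here is a rough sketch
of the distinction at the POSIX level, without any Perl headers. It is
an assumption that Perl's switch_to_global_locale() does something
comparable to the uselocale (LC_GLOBAL_LOCALE) call below; the three
helper names are hypothetical and only serve the illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <locale.h>

/* Returns 1 if the calling thread follows the process-wide (global)
   locale, 0 if a thread-specific locale set with uselocale() is
   active.  uselocale() with a null argument only queries. */
static int
thread_uses_global_locale (void)
{
  return uselocale ((locale_t) 0) == LC_GLOBAL_LOCALE;
}

/* Hypothetical helper: give the calling thread its own "C" locale.
   While it is active, setlocale() calls change the global locale
   but not the one this thread actually uses.  Returns the locale
   object, or a null handle on failure. */
static locale_t
install_thread_locale (void)
{
  locale_t loc = newlocale (LC_ALL_MASK, "C", (locale_t) 0);

  if (loc != (locale_t) 0)
    uselocale (loc);
  return loc;
}

/* Hypothetical helper: make the thread follow the global locale
   again and release the thread-specific locale object. */
static void
revert_to_global_locale (locale_t loc)
{
  uselocale (LC_GLOBAL_LOCALE);
  if (loc != (locale_t) 0)
    freelocale (loc);
}
```

Between install_thread_locale() and revert_to_global_locale(),
thread_uses_global_locale() returns 0, so a setlocale (LC_CTYPE, ...)
made in that window does not change the locale the thread itself is
using.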
From 1f27900352e04ff4f19bec1c1e9635adad2be31c Mon Sep 17 00:00:00 2001
From: Niko Tyni <nt...@debian.org>
Date: Fri, 18 May 2018 10:40:00 +0100
Subject: [PATCH] Fix unescaped left braces in regexps, deprecated since Perl 5.27.8

This fixes test failures on recent Perl versions.
---
 tp/Texinfo/Parser.pm | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/tp/Texinfo/Parser.pm b/tp/Texinfo/Parser.pm
index dc32ca2..c577aa9 100644
--- a/tp/Texinfo/Parser.pm
+++ b/tp/Texinfo/Parser.pm
@@ -5478,11 +5478,11 @@ sub _parse_special_misc_command($$$$)
     }
   } elsif ($command eq 'clickstyle') {
     # REMACRO
-    if ($line =~ /^\s+@([[:alnum:]][[:alnum:]\-]*)({})?\s*/) {
+    if ($line =~ /^\s+@([[:alnum:]][[:alnum:]\-]*)(\{\})?\s*/) {
       $args = ['@'.$1];
       $self->{'clickstyle'} = $1;
       $remaining = $line;
-      $remaining =~ s/^\s+@([[:alnum:]][[:alnum:]\-]*)({})?\s*(\@(c|comment)((\@|\s+).*)?)?//;
+      $remaining =~ s/^\s+@([[:alnum:]][[:alnum:]\-]*)(\{\})?\s*(\@(c|comment)((\@|\s+).*)?)?//;
       $has_comment = 1 if (defined($4));
     } else {
       $self->line_error (sprintf($self->__(
-- 
2.17.0