https://bz.apache.org/SpamAssassin/show_bug.cgi?id=7374
Joe Quinn <[email protected]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |[email protected]

--- Comment #2 from Joe Quinn <[email protected]> ---
It looks like the regex in question is this:

    if ($text !~ /^(?:[ \t\n\r\f\x0b]|\xc2\xa0)*\z/s) {
      $invisible_for_bayes = $self->html_font_invisible($text);
    }

The condition is true when $text contains anything other than a fixed set of
whitespace characters (ASCII whitespace or a UTF-8 encoded NBSP).

From perldiag:

    Complex regular subexpression recursion limit (%d) exceeded

    (W regexp) The regular expression engine uses recursion in complex
    situations where back-tracking is required. Recursion depth is limited
    to 32766, or perhaps less in architectures where the stack cannot grow
    arbitrarily. ("Simple" and "medium" situations are handled without
    recursion and are not subject to a limit.) Try shortening the string
    under examination; looping in Perl code (e.g. with while) rather than
    in the regular expression engine; or rewriting the regular expression
    so that it is simpler or backtracks less. (See perlfaq2 for information
    on Mastering Regular Expressions.)

This regex must be complex enough that the engine handles it with recursion,
and on a sufficiently long input it hits that limit of roughly 32k.

I think this has to do with the quantifier wrapping a nested choice between
single-byte whitespace and the two-byte NBSP, which makes backtracking more
complicated. I believe we can eliminate backtracking entirely here by using an
atomic group, because this regex will never succeed on a less than totally
greedy match.

    if ($text !~ /^(?>[ \t\n\r\f\x0b]|\xc2\xa0)*\z/s) {
      $invisible_for_bayes = $self->html_font_invisible($text);
    }

see also:
http://stackoverflow.com/questions/26226630/latest-perl-wont-match-certain-regexes-more-than-32768-characters-long
http://perldoc.perl.org/perlre.html#(%3f%3epattern)
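
As a quick sanity check on the proposed change, the sketch below compares the original `(?:...)` pattern with the atomic `(?>...)` variant on a few short inputs. The helper `is_blank` and the sample strings are made up for illustration and are not part of SpamAssassin. The script only checks that both patterns give the same answer on small inputs. It does not try to reproduce the recursion-limit warning, and it does not show whether the atomic group makes that warning go away.

```perl
use strict;
use warnings;

# Original pattern from the comment above (ordinary non-capturing group).
my $plain  = qr/^(?:[ \t\n\r\f\x0b]|\xc2\xa0)*\z/s;
# Proposed pattern (atomic group: a matched alternative is never revisited).
my $atomic = qr/^(?>[ \t\n\r\f\x0b]|\xc2\xa0)*\z/s;

# Hypothetical helper: returns 1 if $text consists only of ASCII whitespace
# and/or UTF-8 encoded NBSP byte pairs, 0 otherwise.
sub is_blank {
    my ($text, $re) = @_;
    return $text =~ $re ? 1 : 0;
}

# Illustrative inputs, written as raw bytes.
my @samples = (
    [ 'empty string'       => ""                   ],
    [ 'ASCII whitespace'   => " \t\r\n"            ],
    [ 'NBSP, space, NBSP'  => "\xc2\xa0 \xc2\xa0"  ],
    [ 'visible character'  => "  x  "              ],
    [ 'truncated NBSP'     => "\xc2"               ],
);

for my $sample (@samples) {
    my ($label, $bytes) = @$sample;
    my $old = is_blank($bytes, $plain);
    my $new = is_blank($bytes, $atomic);
    printf "%-20s plain=%d atomic=%d\n", $label, $old, $new;
}
```

The two alternatives cannot match at the same position (`\xc2` is not in the character class), so no match depends on going back into the group to try the other branch. That is why both patterns are expected to agree on every input.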
