Excerpts from Edward Z. Yang's message of Fri Jun 05 17:47:00 -0400 2009:
> Now that you mention it, the messages that tickle this bug on my side also
> have one extremely long line. That's very interesting.

Here is the culprit, laid out to bear its full shame:

    /\w.*:$/

I thought this was a suspicious-looking regex; a simple test confirmed my
belief:

    line = ":a" * 10000
    line =~ /\w.*:$/

Ba boom ba boom ba boom. This is a textbook case of catastrophic
backtracking. I have two possible fixes. They take about the same time on
ordinary input, but the second one is faster on really long strings.

First, the simple one:

diff --git a/lib/sup/message.rb b/lib/sup/message.rb
index 5993729..0ddd3af 100644
--- a/lib/sup/message.rb
+++ b/lib/sup/message.rb
@@ -26,7 +26,7 @@ class Message
   QUOTE_PATTERN = /^\s{0,4}[>|\}]/
   BLOCK_QUOTE_PATTERN = /^-----\s*Original Message\s*----+$/
-  QUOTE_START_PATTERN = /\w.*:$/
+  QUOTE_START_PATTERN = /\w\W*:$/
   SIG_PATTERN = /(^-- ?$)|(^\s*----------+\s*$)|(^\s*_________+\s*$)|(^\s*--~--~-)|(^\s*--\+\+\*\*==)/
   MAX_SIG_DISTANCE = 15 # lines from the end

And the slightly more complicated one (but optimal for large n):

diff --git a/lib/sup/message.rb b/lib/sup/message.rb
index 5993729..c5481a6 100644
--- a/lib/sup/message.rb
+++ b/lib/sup/message.rb
@@ -26,7 +26,6 @@ class Message
   QUOTE_PATTERN = /^\s{0,4}[>|\}]/
   BLOCK_QUOTE_PATTERN = /^-----\s*Original Message\s*----+$/
-  QUOTE_START_PATTERN = /\w.*:$/
   SIG_PATTERN = /(^-- ?$)|(^\s*----------+\s*$)|(^\s*_________+\s*$)|(^\s*--~--~-)|
   MAX_SIG_DISTANCE = 15 # lines from the end
@@ -449,7 +448,7 @@ private
       when :text
         newstate = nil
-        if line =~ QUOTE_PATTERN || (line =~ QUOTE_START_PATTERN && nextline =~ QUO
+        if line =~ QUOTE_PATTERN || (line =~ /:$/ && line =~ /\w/ && nextline =~ QU
           newstate = :quote
         elsif line =~ SIG_PATTERN && (lines.length - i) < MAX_SIG_DISTANCE
           newstate = :sig

There are a number of micro-optimizations that could be made to message
parsing, but this will basically fix the egregious problem.

Cheers,
Edward
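
As a sanity check on the two fixes, here is a minimal sketch that compares them against the original pattern on a few single-line inputs. It is not part of either patch: the constant names ORIGINAL and FIX_ONE, the helper fix_two?, and the sample lines are all made up for illustration. It assumes each input is a single line, which is how the parser feeds them in.

```ruby
# Sketch only: every name below is hypothetical, none of it exists in sup.
ORIGINAL = /\w.*:$/    # the pattern both patches remove
FIX_ONE  = /\w\W*:$/   # replacement from the first patch

# Second patch: two cheap, independent checks instead of one pattern.
def fix_two?(line)
  !!(line =~ /:$/ && line =~ /\w/)
end

# Illustrative single-line inputs, not taken from real mail.
SAMPLES = [
  "On Friday, Bob wrote:",
  "Bob wrote: ",
  ":",
  "a:",
  "::: :",
  "no colon here",
  "x: y :",
  "",
]

# Collect every sample on which the three checks disagree.
MISMATCHES = SAMPLES.reject do |line|
  expected = !!(line =~ ORIGINAL)
  expected == !!(line =~ FIX_ONE) && expected == fix_two?(line)
end

puts MISMATCHES.empty? ? "all three agree" : "mismatch on: #{MISMATCHES.inspect}"
```

The idea behind both fixes is the same: a line ending in ":" that contains a word character anywhere must have that character before the final colon, so there is no need for `.*` to scan to the end of the line from every word character.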