tags 699632 + upstream fixed-upstream
quit
Hi Thomas,
Thomas Dickey wrote:
> On Sat, Feb 02, 2013 at 06:14:14PM +0000, François Wendling wrote:
>> I have found a bug in mawk. I wanted to strip HTML tags the dirty way, and
>> wanted to make a Perl-style non-greedy match (I have since discovered that
>> isn't possible...), but along the way I made a typo and executed this:
>>
>> # You don't want to run this ;)
>> mawk '{gsub("<.+*>"," ")} NF > 0 {print}' /tmp/html
>
> It doesn't appear to be a problem with current mawk:

Presumably that is because of the following patch:

  + modify regular-expression engine to avoid exponential running time
    for some regular expression matches in which the first match mawk
    finds extends to the end of the string.

I can rebase that patch onto Debian's mawk if anyone is interested
in using it.
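
Unfortunately, as the description implies, the patch only covers
matches in which the first match found extends to the end of the
string, so some similar patterns can still take exponential time.
A more complete fix would involve switching to a DFA-based engine
instead of a backtracking one.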
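
To illustrate the difference, here is a minimal Python sketch. It is
not mawk's code: the hand-written NFA for (a|aa)*b, the TRANSITIONS
table and both function names are made up for this example. The first
matcher tries one path at a time and backs up on failure; the second
tracks every reachable state at once, which is roughly what a DFA does.

```python
# Toy NFA for the regex (a|aa)*b, written by hand for illustration only.
# TRANSITIONS[state][char] -> list of possible next states.
TRANSITIONS = {
    0: {"a": [0, 1], "b": [2]},  # 'a' alone loops; or it starts an "aa"
    1: {"a": [0]},               # second 'a' of "aa"
    2: {},                       # accepting state, no way out
}
START, ACCEPT = 0, 2


def backtrack_match(text):
    """Follow one path at a time, backing up on failure.

    Returns (matched, steps), where steps counts the visited
    (state, position) pairs, including repeats.
    """
    steps = 0

    def walk(state, pos):
        nonlocal steps
        steps += 1
        if pos == len(text):
            return state == ACCEPT
        for nxt in TRANSITIONS[state].get(text[pos], []):
            if walk(nxt, pos + 1):
                return True
        return False

    return walk(START, 0), steps


def stateset_match(text):
    """Advance the whole set of reachable states one character at a time.

    Returns (matched, steps), where steps counts state expansions.
    """
    steps = 0
    current = {START}
    for ch in text:
        nxt = set()
        for state in current:
            steps += 1
            nxt.update(TRANSITIONS[state].get(ch, []))
        current = nxt
    return ACCEPT in current, steps


if __name__ == "__main__":
    # A run of 'a' with no trailing 'b' never matches, so the
    # backtracking matcher has to exhaust every way of splitting it.
    for n in (10, 15, 20):
        text = "a" * n
        print(n, backtrack_match(text)[1], stateset_match(text)[1])
```

On a failing input of n 'a' characters, the backtracking step count
grows like the Fibonacci numbers, while the state-set version does at
most two expansions per character. A real DFA engine would precompute
or cache those state sets rather than rebuild them for every input.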
Regards,
Jonathan