tags 699632 + upstream fixed-upstream
quit

Hi Thomas,

Thomas Dickey wrote:
> On Sat, Feb 02, 2013 at 06:14:14PM +0000, François Wendling wrote:

>> I have found a bug in mawk. I wanted to strip HTML tags the dirty way, and
>> wanted to make a perlish non greedy match  (i since discovered it wasn't
>> possible...), but on the way, i made a typo and executed this : 
>>
>> # You don't want to run this ;)
>> mawk '{gsub("<.+*>"," ")} NF > 0 {print}' /tmp/html 
>
> It doesn't appear to be a problem with current mawk:

Presumably that is because of the following patch:

        + modify regular-expression engine to avoid exponential running time
          for some regular expression matches in which the first match mawk
          finds extends to the end of the string.

I can rebase that patch onto Debian's mawk if anyone is interested
in using it.

Unfortunately, as the description implies, the patch only covers
matches that extend to the end of the string, so some similar cases
can still take exponential time.  A more complete fix would involve
switching from the backtracking engine to a DFA-based one.
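
To illustrate the difference, here is a toy sketch in Python. It is
not mawk's code: both functions are hypothetical and hard-wired to the
pattern <(.+)*> (the reporter's "<.+*>" with the grouping made
explicit), matched only at the start of the string. The first
backtracks over every way of splitting the text among repetitions of
.+; the second tracks the set of reachable NFA states, which is what a
DFA-based engine effectively does one character at a time.

```python
def backtrack_match(s):
    """Toy backtracking match of <(.+)*> at s[0]; returns (matched, steps)."""
    steps = 0

    def star(i):
        # Match (.+)* followed by '>' starting at index i.
        nonlocal steps
        steps += 1
        # Try every possible length for one more .+ repetition, longest first.
        for j in range(len(s), i, -1):
            if star(j):
                return True
        # No further repetitions: the next character must be '>'.
        return i < len(s) and s[i] == '>'

    if not s or s[0] != '<':
        return False, steps
    return star(1), steps


def state_set_match(s):
    """Same pattern, simulated as a set of NFA states; returns (matched, steps)."""
    # States: 0 = start, 1 = inside the tag after '<', 2 = accepted after '>'.
    steps = 0
    states = {0}
    for ch in s:
        steps += 1
        nxt = set()
        for st in states:
            if st == 0 and ch == '<':
                nxt.add(1)
            elif st == 1:
                nxt.add(1)      # (.+)* can consume any character
                if ch == '>':
                    nxt.add(2)  # closing '>' seen
        if 2 in nxt:
            return True, steps
        if not nxt:
            break               # no live states, no match possible
        states = nxt
    return False, steps


if __name__ == "__main__":
    for n in (5, 10, 15):
        text = "<" + "a" * n    # no closing '>', so both must fail
        print(n, backtrack_match(text)[1], state_set_match(text)[1])
```

On an unterminated tag the step count of the backtracking version
doubles with each extra character, while the state-set version does
one step per character.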

Regards,
Jonathan

