* Chris Kuklewicz <hask...@list.mightyreason.com> [2011-09-13 08:21:55+0100]
> I wrote regex-tdfa, the efficient (though not yet lightning fast) Posix-like 
> engine.
> 
> You are not the first to want an efficient Perl-like engine.  The answer you
> seek flows from Russ Cox (though in C++):
> 
> > http://google-opensource.blogspot.com/2010/03/re2-principled-approach-to-regular.html
> 
> > http://code.google.com/p/re2/
> 
> Quoting relevant bit:
> 
> > It also finds the leftmost-first match, the same match that Perl would, and
> > can return submatch information. The one significant exception is that RE2
> > drops support for backreferences¹ and generalized zero-width assertions,
> > because they cannot be implemented efficiently.

Hi Chris, thanks for the response.

There's one thing about Cox's work that I don't understand. On the
page [1] he mentions that the algorithm falls back to direct NFA
simulation (search for "If all else fails" on that page).
This defeats his goal of defending against DoS attacks (the second
paragraph of that page).

And I wouldn't be comfortable with an algorithm that is worst-case
exponential either.

Then there's another issue: I specifically want a combinator library,
and not every automaton-based algorithm can serve this purpose in a
statically typed language (unless there's some clever trick I don't
know).

[1]: http://swtch.com/~rsc/regexp/regexp3.html

-- 
Roman I. Cheplyaka :: http://ro-che.info/

_______________________________________________
Haskell-Cafe mailing list
Haskell-Cafe@haskell.org
http://www.haskell.org/mailman/listinfo/haskell-cafe

Reply via email to