Hi Sahank,
On 26.10.2013, at 05:01, Sashank Dara <[email protected]> wrote:
>
> Regular expressions in modern languages are not "pure"RE . they have
> additional features using back lash which causes ambiguity and some times
> even results in ReDOS attacks .
>
> Now how can we explain in langsec terms ? Or how to fix this using langsec
> principles ?
>
n my opinion there are two problems:
(1) real-life PCRE is beyond regular; backreferences mess everything up. With
backreferences you can match all context-free and even some context-sensitive
languages. See
http://nikic.github.io/2012/06/15/The-true-power-of-regular-expressions.html
Therefore, the implementations have some hidden exponential time or space
issues.
(2) the actual implementation; see http://swtch.com/~rsc/regexp/
Strictly speaking about regular languages, the conversion from RE to executable
DFA leads to exponentially many states (space). To safe space and time for this
conversion, most implementations use backtracking and therefore shift the
problem to runtime. The assumption is that nondeterminism in practical REs is
limited but this does not hold — DoS becomes possible. Just look at
implementations and you will understand.
Cheers, Harald
_______________________________________________
langsec-discuss mailing list
[email protected]
https://mail.langsec.org/cgi-bin/mailman/listinfo/langsec-discuss