On Wed, Oct 05, 2016 at 01:17:49PM +0200, Johannes Schindelin wrote:
> Hi Rich,
> 
> On Tue, 4 Oct 2016, Rich Felker wrote:
> 
> > On Tue, Oct 04, 2016 at 06:08:33PM +0200, Johannes Schindelin wrote:
> >
> > > And lastly, the best alternative would be to teach musl about
> > > REG_STARTEND, as it is rather useful a feature.
> > 
> > Maybe, but it seems fundamentally costly to support -- it's extra
> > state in the inner loops that imposes costly spill/reload on archs
> > with too few registers (x86).
> 
> It is true that it could cause that.
> 
> I had a brief look at the source code (you use backtracking...

Where did you get that idea? Backtracking is the most utterly
incompetent way to implement regex -- it throws away the whole
property that makes regex useful, being regular. Unfortunately, POSIX
BRE is not regular, as it contains backreferences, so any
implementation of regcomp/regexec requires at least a minimal
backtracking code path for BREs that contain backreferences.

> hopefully
> nobody uses musl to parse regular expressions from untrusted, or

On the contrary, musl's is the only system reccomp/regexec I'm aware
of that actually attempts to be safe with untrusted input -- when
using REG_EXTENDED (ERE). Other implementations provide backreferences
in ERE as an extension, making ERE unsafe just like BRE. musl
intentionally disallows them as a feature.

At least until recently, glibc also crashed on malloc failures in
regcomp, making it unsafe on untrusted input for that reason too.

Rich

Reply via email to