Hi,
while studying martijn@'s pending regexec(3) patch, i found
a read-access one-byte buffer underflow in "case OBOL" in the
function backref(), file libc/regex/engine.c.

I think outright bugs (like crashes) ought to be fixed before
improving functionality. So i'd like to get this in first, then
return to checking martijn@'s improvements, in particular since
this doesn't conflict with martijn@'s patch, even though it touches
closely related code.

The following is required to trigger it:
 * REG_BASIC passed to regcomp(3)
 * REG_NOTBOL passed to regexec(3)
 * The regular expression must contain at least one backreference.
 * The regular expression must start with at least one opening
   parenthesis starting a subexpression.
 * The first elementary atom in the expression must be '^'.
   It must be at the beginning of a subexpression
   that is quantified by '*'.

That may seem a bit contrived at first, but with REG_NEWLINE, there
might even be practical use cases for patterns starting with something
similar to "\(^foo\)*", continuing with something else, then using
a backreference - for example to capture lines containing the
"something else" bracketed with lines containing variations of foo.
Even if such stuff is contrived, i don't think we want regexec(3)
to ever segfault. Besides, doing explicit index checks before
backing up a byte in a buffer helps code auditors, and the regex
guts are already next door to auditor's hell.
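
To illustrate what i mean by an explicit index check, here is a
minimal sketch. Note that at_bol() is a made-up helper and not a
function in engine.c, and that it assumes the start of the string
and the start of the search region coincide, which the real code
does not assume. The point is merely that the byte before sp is
only read after confirming that sp is not at the very beginning
of the buffer.

```c
#include <stddef.h>

/*
 * Hypothetical helper, not part of engine.c:
 * report whether the position sp inside the buffer [start, end)
 * is at the beginning of a line.
 * notbol stands in for REG_NOTBOL, newline for REG_NEWLINE.
 */
static int
at_bol(const char *start, const char *end, const char *sp,
    int notbol, int newline)
{
	/* The very start of the buffer counts unless notbol is set. */
	if (sp == start && !notbol)
		return 1;

	/*
	 * Only look at the preceding byte after checking
	 * that there is a preceding byte inside the buffer.
	 */
	if (newline && sp > start && sp < end && sp[-1] == '\n')
		return 1;

	return 0;
}
```

With notbol set and sp == start, neither branch dereferences
sp[-1], so nothing in front of the buffer is ever touched.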

I'm appending a test program demonstrating the segfault
and a patch fixing it.

OK?
  Ingo

----- 8< ----- schnipp ----- >8 ----- 8< ----- schnapp ----- >8 -----
#include <sys/types.h>
#include <err.h>
#include <regex.h>
#include <stdlib.h>

int
main(void)
{
	regex_t	 re;
	char	*buf;

	if (regcomp(&re, "\\(^\\)*\\(x\\)\\2", REG_BASIC | REG_NEWLINE))
		errx(1, "regcomp");

	/*
	 * Allocate a huge buffer such that we get
	 * a guard page in front of it.
	 */

	if ((buf = malloc(64 * 1024)) == NULL)
		err(1, NULL);
	buf[0] = 'x';
	buf[1] = 'x';
	buf[2] = '\0';

	/*
	 * Trigger the segfault in regex/engine.c,
	 * backref(), case OBOL.
	 */

	regexec(&re, buf, 0, NULL, REG_NOTBOL);
	errx(1, "This is unexpected: regexec did not segfault.");
}
----- 8< ----- schnipp ----- >8 ----- 8< ----- schnapp ----- >8 -----
Index: engine.c
===================================================================
RCS file: /cvs/src/lib/libc/regex/engine.c,v
retrieving revision 1.19
diff -u -p -r1.19 engine.c
--- engine.c 28 Dec 2015 23:01:22 -0000 1.19
+++ engine.c 15 May 2016 15:18:22 -0000
@@ -506,9 +506,9 @@ backref(struct match *m, char *start, ch
 			return(NULL);
 		break;
 	case OBOL:
-		if ( (sp == m->beginp && !(m->eflags&REG_NOTBOL)) ||
-				(sp < m->endp && *(sp-1) == '\n' &&
-				(m->g->cflags&REG_NEWLINE)) )
+		if ((sp == m->beginp && !(m->eflags&REG_NOTBOL)) ||
+		    (sp > m->offp && sp < m->endp &&
+		    *(sp-1) == '\n' && (m->g->cflags&REG_NEWLINE)))
 			{ /* yes */ }
 		else
 			return(NULL);