On Tue, 6 Sep 2016, Jeff King wrote:
> On Mon, Sep 05, 2016 at 05:45:09PM +0200, Johannes Schindelin wrote:
> > Before calling regexec() on the file contents, we better be certain that
> > the strings fulfill the contract of C strings assumed by said function.
> If you have a buffer that is exactly "size" bytes and you are worried
> about regexec reading off the end, then...
> > diff --git a/diffcore-pickaxe.c b/diffcore-pickaxe.c
> > index 55067ca..88820b6 100644
> > --- a/diffcore-pickaxe.c
> > +++ b/diffcore-pickaxe.c
> > @@ -49,6 +49,8 @@ static int diff_grep(mmfile_t *one, mmfile_t *two,
> >  	xpparam_t xpp;
> >  	xdemitconf_t xecfg;
> > +	assert(!one || one->ptr[one->size] == '\0');
> > +	assert(!two || two->ptr[two->size] == '\0');
> >  	if (!one)
> >  		return !regexec(regexp, two->ptr, 1, &regmatch, 0);
> ...don't your asserts also read off the end?
Yes, they would read off the end, *unless* a NUL was somehow appended to
> So you might still segfault, though you do catch a case where we have N
> bytes of junk before the end of the page (and you have a 255/256 chance
> of catching it).
Right. The assertion may fail, or a segfault happen. In both cases,
assumptions are violated and we need to fix the code.
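
To illustrate the point: the byte at ptr[size] can only be inspected safely
when it is known to belong to the allocation. Below is a minimal sketch of
one way to guarantee that, by copying into a buffer of size + 1 bytes and
terminating it explicitly before calling regexec(). The names sized_buf and
match_sized are hypothetical, made up for this example; this is not the code
in diffcore-pickaxe.c, and the extra copy is a cost a real fix may well want
to avoid.

```c
#include <assert.h>
#include <regex.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for mmfile_t: ptr holds exactly size bytes. */
struct sized_buf {
	const char *ptr;
	size_t size;
};

/*
 * Match a pattern against a sized buffer without assuming that the
 * buffer is NUL-terminated. Returns 1 on match, 0 on no match and
 * -1 on error (bad pattern or out of memory).
 */
int match_sized(const char *pattern, const struct sized_buf *buf)
{
	regex_t re;
	char *copy;
	int ret;

	if (regcomp(&re, pattern, REG_EXTENDED | REG_NOSUB))
		return -1;

	copy = malloc(buf->size + 1);
	if (!copy) {
		regfree(&re);
		return -1;
	}
	memcpy(copy, buf->ptr, buf->size);
	/* This byte is part of our own allocation, so touching it is in bounds. */
	copy[buf->size] = '\0';

	ret = !regexec(&re, copy, 0, NULL, 0);

	free(copy);
	regfree(&re);
	return ret;
}
```

With this shape, an assertion such as assert(copy[buf->size] == '\0') would
be safe, because the terminator lives inside the allocation rather than one
byte past it.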