Hi Peff & Junio,

On Tue, 6 Sep 2016, Jeff King wrote:

> On Mon, Sep 05, 2016 at 12:10:11PM -0700, Junio C Hamano wrote:
> >  * We could use <ptr,len> variant of regexp engine as you proposed,
> >    which I think is a preferable solution.  Do people know of a
> >    widely accepted implementation that we can throw into compat/ as
> >    fallback that is compatible with GPLv2?
> Maybe the one already in compat/regex? ;P

Indeed. That happens to be the implementation used by Git for Windows.

> I think re_search() is the correct replacement function but it's been a
> while since I've looked into it.

The segfault I investigated happened in a call to strlen(). I see many
calls to strlen() in compat/regex/... The one that triggers the segfault
is in regexec(), compat/regex/regexec.c:241.

As to re_search(): I have not been able to audit its callees in a
reasonable amount of time. I agree that they *should* not read past the
end of the buffer, but I cannot easily verify it.

The bigger problem is that re_search() is defined in the __USE_GNU section
of regex.h, and I do not think it is appropriate to universally #define
said constant before #include'ing regex.h. So it would appear that major
surgery would be required if we wanted to use regular expressions on
strings that are not NUL-terminated.
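
To illustrate what that would look like, here is a minimal sketch,
assuming a glibc-style regex.h where _GNU_SOURCE is what ends up enabling
the __USE_GNU section. The helper name search_buf() is made up for this
example and is not an existing Git function:

```c
#define _GNU_SOURCE	/* assumption: glibc; this is what exposes re_search() */
#include <regex.h>
#include <string.h>

/*
 * Hypothetical helper: search a <ptr,len> buffer for an extended regex.
 * Returns the offset of the first match, -1 if there is no match, and
 * -2 on a compile or internal error.
 */
static int search_buf(const char *pattern, const char *buf, int len)
{
	struct re_pattern_buffer pb;
	const char *err;
	int pos;

	/* a zeroed pattern buffer lets the library allocate what it needs */
	memset(&pb, 0, sizeof(pb));
	re_syntax_options = RE_SYNTAX_POSIX_EXTENDED;

	err = re_compile_pattern(pattern, strlen(pattern), &pb);
	if (err)
		return -2;

	/* start at offset 0, try every start position up to len */
	pos = re_search(&pb, buf, len, 0, len, NULL);

	regfree(&pb);
	return pos;
}
```

Note that the very first line is the #define I would rather not sprinkle
over the code base, and that this still relies on re_search() honoring
the length, which is exactly what I could not verify.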

So I agree that a better idea may be to simply ensure NUL-terminated
buffers when we require them, although that still might be tricky. More on
that in a reply to your comment to that end.
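
For reference, the most naive way to guarantee NUL termination is to copy
the buffer before matching. A rough sketch, using only POSIX regexec();
the name regexec_copy() is hypothetical and the extra allocation per call
is an obvious cost:

```c
#include <regex.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: run regexec() on a <ptr,len> buffer by matching
 * against a temporary NUL-terminated copy. The offsets in pmatch are
 * relative to the start of the copy, so they are valid for buf as well.
 * A NUL byte inside buf would still cut the match short.
 */
static int regexec_copy(const regex_t *preg, const char *buf, size_t len,
			size_t nmatch, regmatch_t pmatch[], int eflags)
{
	char *copy = malloc(len + 1);
	int ret;

	if (!copy)
		return REG_ESPACE;

	memcpy(copy, buf, len);
	copy[len] = '\0';

	ret = regexec(preg, copy, nmatch, pmatch, eflags);

	free(copy);
	return ret;
}
```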


