Hi Peff & Junio,
On Tue, 6 Sep 2016, Jeff King wrote:
> On Mon, Sep 05, 2016 at 12:10:11PM -0700, Junio C Hamano wrote:
> > * We could use <ptr,len> variant of regexp engine as you proposed,
> > which I think is a preferable solution. Do people know of a
> > widely accepted implementation that we can throw into compat/ as
> > fallback that is compatible with GPLv2?
> Maybe the one already in compat/regex? ;P
Indeed. That happens to be the implementation used by Git for Windows.
> I think re_search() is the correct replacement function but it's been a
> while since I've looked into it.
The segfault I investigated happened in a call to strlen(). I see many
calls to strlen() in compat/regex/... The one that triggers the segfault
is in regexec(), compat/regex/regexec.c:241.
As to re_search(): I have not been able to reason about its callees in a
reasonable amount of time. I agree that they *should* not run over the
buffer, but I cannot easily verify it.
The bigger problem is that re_search() is defined in the __USE_GNU section
of regex.h, and I do not think it is appropriate to universally #define
said constant before #include'ing regex.h. So it would appear that major
surgery would be required if we wanted to use regular expressions on
strings that are not NUL-terminated.
So I agree that a better idea may be to simply ensure NUL-terminated
buffers when we require them, although that still might be tricky. More on
that in a reply to your comment to that end.
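
To make the "ensure NUL-terminated buffers" idea concrete, here is a minimal sketch of its most naive form: copy the <ptr,len> region into a scratch buffer, append a NUL, and hand that to plain POSIX regexec(). The helper name regexec_copy_buf() is made up for illustration and is not existing Git code; a real patch would likely want to avoid the extra allocation and copy.

```c
#include <regex.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper (not part of Git): match a <ptr,len> buffer that
 * may lack a trailing NUL by copying it into a NUL-terminated scratch
 * buffer first. Returns whatever regexec() returns, or REG_ESPACE if
 * the scratch buffer cannot be allocated.
 *
 * Offsets stored in pmatch[] are relative to the start of the copy and
 * are therefore also valid offsets into buf.
 */
static int regexec_copy_buf(const regex_t *preg, const char *buf, size_t len,
			    size_t nmatch, regmatch_t pmatch[], int eflags)
{
	char *copy = malloc(len + 1);
	int ret;

	if (!copy)
		return REG_ESPACE;
	if (len)
		memcpy(copy, buf, len);
	copy[len] = '\0';	/* regexec() locates the end via this NUL */

	ret = regexec(preg, copy, nmatch, pmatch, eflags);
	free(copy);
	return ret;
}
```

One caveat with this approach: if the buffer itself contains a NUL byte, regexec() stops looking there, so anything after the first embedded NUL is never matched.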