sorry for the late answer, I was really busy trying to come up with a new
and improved version of the patch series, and while hunting a bug I had
introduced, I got bogged down with other tasks.
The good news is that I made up my mind about releasing a Git for Windows
v2.10.0(2): originally, I had planned to do that today, to have time for
any hot fixes until Sunday, if necessary, before going semi-dark.
FWIW I am now trying to track my plans for v2.10.0(2) (or v2.10.1, if
upstream Git v2.10.1 is released before) on GitHub:
On Tue, 6 Sep 2016, Jeff King wrote:
> On Tue, Sep 06, 2016 at 04:06:32PM +0200, Johannes Schindelin wrote:
> > > I think re_search() is the correct replacement function but it's been a
> > > while since I've looked into it.
> > The segfault I investigated happened in a call to strlen(). I see many
> > calls to strlen() in compat/regex/... The one that triggers the segfault
> > is in regexec(), compat/regex/regexec.c:241.
> Yes, that is the important one, I think. The others are for patterns,
> error msgs, etc. Of course strlen() is not the only function that cares
> about NUL delimiters (and there might even be a "while (*p)" somewhere
> in the code).
> I always assumed the _point_ of re_search taking a ptr/len pair was
> exactly to handle this case. The documentation says:
> `string` is the string you want to match; it can contain newline and
> null characters. `size` is the length of that string.
> Which seems pretty definitive to me (that's for re_match(), but
> re_search() is defined in the docs in terms of re_match()).
Right. The problem is: I *really* want to avoid using GNU-isms.
> > The bigger problem is that re_search() is defined in the __USE_GNU section
> > of regex.h, and I do not think it is appropriate to universally #define
> > said constant before #include'ing regex.h. So it would appear that major
> > surgery would be required if we wanted to use regular expressions on
> > strings that are not NUL-terminated.
> We can contain this to the existing compat/regex/regexec.c, and just
> provide a wrapper that is similar to regexec but takes a ptr/len pair.
But we can do even better than that: we can provide a wrapper that uses
REG_STARTEND where available (which is really the majority of platforms we
care about: Linux, MacOSX, Windows, and even the *BSDs). Where it is not
available, we simply malloc(), memcpy() and append a NUL.
Which is what my v2 does (will send it out in a moment).
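
To illustrate the idea (this is only a rough sketch, not the actual patch):
a wrapper that takes a ptr/len pair, passes the bounds via REG_STARTEND
where regex.h defines it, and otherwise falls back to a NUL-terminated
copy. The name regexec_buf_sketch() and the requirement that the caller
provide at least one regmatch_t are assumptions made for this example.

```c
#include <assert.h>
#include <regex.h>
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical helper: match `preg` against buf[0..size), which need not
 * be NUL-terminated. The caller must pass nmatch >= 1 because
 * REG_STARTEND uses pmatch[0] to receive the bounds.
 */
static int regexec_buf_sketch(const regex_t *preg, const char *buf,
			      size_t size, size_t nmatch,
			      regmatch_t pmatch[], int eflags)
{
	assert(nmatch > 0 && pmatch);
#ifdef REG_STARTEND
	/* The extension reads the range from pmatch[0]; no strlen(). */
	pmatch[0].rm_so = 0;
	pmatch[0].rm_eo = (regoff_t)size;
	return regexec(preg, buf, nmatch, pmatch, eflags | REG_STARTEND);
#else
	/* Fallback: malloc(), memcpy() and append a NUL. */
	char *copy = malloc(size + 1);
	int ret;

	if (!copy)
		return REG_ESPACE;
	memcpy(copy, buf, size);
	copy[size] = '\0';
	/* Offsets in pmatch are relative to copy, hence also to buf. */
	ret = regexec(preg, copy, nmatch, pmatch, eflags);
	free(copy);
	return ret;
#endif
}
```

Note that the fallback still stops matching at the first NUL embedded in
the buffer; it only guarantees that regexec() never reads past `size`.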