Re: [PATCH] regex: do not call `regfree()` if compilation fails

2018-05-22 Thread Martin Ågren
On 22 May 2018 at 04:20, Eric Sunshine  wrote:
> On Mon, May 21, 2018 at 2:43 PM, Stefan Beller  wrote:
>> On Sun, May 20, 2018 at 3:50 AM, Martin Ågren  wrote:
>>> It is apparently undefined behavior to call `regfree()` on a regex where
>>> `regcomp()` failed. [...]
>>>
>>> diff --git a/diffcore-pickaxe.c b/diffcore-pickaxe.c
>>> @@ -215,7 +215,6 @@ static void regcomp_or_die(regex_t *regex, const char 
>>> *needle, int cflags)
>>> /* The POSIX.2 people are surely sick */
>>> char errbuf[1024];
>>> regerror(err, regex, errbuf, 1024);
>>> -   regfree(regex);
>>> die("invalid regex: %s", errbuf);
>>
>> While the commit message is very clear why we supposedly introduce a leak 
>> here,
>> it is hard to be found from the source code (as we only delete code
>> there, so digging
>> for history is not obvious), so maybe
>>
>>  /* regfree(regex) is invalid here */
>>
>> instead?
>
> The commit message doesn't say that we are supposedly introducing a
> leak (and, indeed, no resources should have been allocated to the
> 'regex' upon failed compile). It's saying that removing this call
> potentially avoids a crash under some implementations.
>
> Given that the very next line is die(), and that the function name has
> "_or_die" in it, I'm not sure that an in-code comment about regfree()
> being invalid upon failed compile would be all that helpful; indeed,
> it could be confusing, causing the reader to wonder why that is
> significant if we're just dying anyhow. I find that the patch, as is,
> clarifies rather than muddles the situation.

Like Eric, I feel that the possible leak here is somewhat uninteresting,
since the next line will die. That said, I seem to recall from my
grepping around earlier that we have other users where we return
with a failure instead of dying.

Any clarifying comments in such code would be a separate patch to me. I
also do not immediately see the need for adding such a comment in those
places. We can do that once we verify that we actually do leak (I would
expect that to happen only in some implementations, and I think there is
a fair chance that we will never encounter such an implementation.)

Martin


Re: [PATCH] regex: do not call `regfree()` if compilation fails

2018-05-21 Thread Eric Sunshine
On Mon, May 21, 2018 at 2:43 PM, Stefan Beller  wrote:
> On Sun, May 20, 2018 at 3:50 AM, Martin Ågren  wrote:
>> It is apparently undefined behavior to call `regfree()` on a regex where
>> `regcomp()` failed. [...]
>>
>> diff --git a/diffcore-pickaxe.c b/diffcore-pickaxe.c
>> @@ -215,7 +215,6 @@ static void regcomp_or_die(regex_t *regex, const char 
>> *needle, int cflags)
>> /* The POSIX.2 people are surely sick */
>> char errbuf[1024];
>> regerror(err, regex, errbuf, 1024);
>> -   regfree(regex);
>> die("invalid regex: %s", errbuf);
>
> While the commit message is very clear why we supposedly introduce a leak 
> here,
> it is hard to be found from the source code (as we only delete code
> there, so digging
> for history is not obvious), so maybe
>
>  /* regfree(regex) is invalid here */
>
> instead?

The commit message doesn't say that we are supposedly introducing a
leak (and, indeed, no resources should have been allocated to the
'regex' upon failed compile). It's saying that removing this call
potentially avoids a crash under some implementations.

Given that the very next line is die(), and that the function name has
"_or_die" in it, I'm not sure that an in-code comment about regfree()
being invalid upon failed compile would be all that helpful; indeed,
it could be confusing, causing the reader to wonder why that is
significant if we're just dying anyhow. I find that the patch, as is,
clarifies rather than muddles the situation.


Re: [PATCH] regex: do not call `regfree()` if compilation fails

2018-05-21 Thread Stefan Beller
On Sun, May 20, 2018 at 3:50 AM, Martin Ågren  wrote:
> It is apparently undefined behavior to call `regfree()` on a regex where
> `regcomp()` failed. The language in [1] is a bit muddy, at least to me,
> but the clearest hint is this (`preg` is the `regex_t *`):
>
> Upon successful completion, the regcomp() function shall return 0.
> Otherwise, it shall return an integer value indicating an error as
> described in , and the content of preg is undefined.
>
> Funnily enough, there is also the `regerror()` function which should be
> given a pointer to such a "failed" `regex_t` -- the content of which
> would supposedly be undefined -- and which may investigate it to come up
> with a detailed error message.
>
> In any case, the example in that document shows how `regfree()` is not
> called after `regcomp()` fails.
>
> We have quite a few users of this API and most get this right. These
> three users do not.
>
> Several implementations can handle this just fine [2] and these code paths
> supposedly have not wreaked havoc or we'd have heard about it. (These
> are all in code paths where git got bad input and is just about to die
> anyway.) But let's just avoid the issue altogether.
>
> [1] http://pubs.opengroup.org/onlinepubs/9699919799/functions/regcomp.html
>
> [2] https://www.redhat.com/archives/libvir-list/2013-September/msg00262.html
>
> Researched-by: Eric Sunshine 
> Signed-off-byi Martin Ågren 
> ---
>
> On 14 May 2018 at 05:05, Eric Sunshine  wrote:
>> My research (for instance [1,2]) seems to indicate that it would be
>> better to avoid regfree() upon failed regcomp(), even though such a
>> situation is handled sanely in some implementations.
>>
>> [1]: https://www.redhat.com/archives/libvir-list/2013-September/msg00276.html
>> [2]: https://www.redhat.com/archives/libvir-list/2013-September/msg00273.html
>
> Thank you for researching this. I think it would make sense to get rid
> of the few places we have where we get this wrong (if our understanding
> of this being undefined is right).
>
>  diffcore-pickaxe.c | 1 -
>  grep.c | 2 --
>  2 files changed, 3 deletions(-)
>
> diff --git a/diffcore-pickaxe.c b/diffcore-pickaxe.c
> index 239ce5122b..800a899c86 100644
> --- a/diffcore-pickaxe.c
> +++ b/diffcore-pickaxe.c
> @@ -215,7 +215,6 @@ static void regcomp_or_die(regex_t *regex, const char 
> *needle, int cflags)
> /* The POSIX.2 people are surely sick */
> char errbuf[1024];
> regerror(err, regex, errbuf, 1024);
> -   regfree(regex);

While the commit message is very clear why we supposedly introduce a leak here,
it is hard to be found from the source code (as we only delete code
there, so digging
for history is not obvious), so maybe

 /* regfree(regex) is invalid here */

instead?

> die("invalid regex: %s", errbuf);
> }
>  }
> diff --git a/grep.c b/grep.c
> index 65b90c10a3..5e4f3f9a9d 100644
> --- a/grep.c
> +++ b/grep.c
> @@ -636,7 +636,6 @@ static void compile_fixed_regexp(struct grep_pat *p, 
> struct grep_opt *opt)
> if (err) {
> char errbuf[1024];
> regerror(err, >regexp, errbuf, sizeof(errbuf));
> -   regfree(>regexp);
> compile_regexp_failed(p, errbuf);
> }
>  }
> @@ -701,7 +700,6 @@ static void compile_regexp(struct grep_pat *p, struct 
> grep_opt *opt)
> if (err) {
> char errbuf[1024];
> regerror(err, >regexp, errbuf, 1024);
> -   regfree(>regexp);
> compile_regexp_failed(p, errbuf);
> }
>  }
> --
> 2.17.0.840.g5d83f92caf
>


[PATCH] regex: do not call `regfree()` if compilation fails

2018-05-20 Thread Martin Ågren
It is apparently undefined behavior to call `regfree()` on a regex where
`regcomp()` failed. The language in [1] is a bit muddy, at least to me,
but the clearest hint is this (`preg` is the `regex_t *`):

Upon successful completion, the regcomp() function shall return 0.
Otherwise, it shall return an integer value indicating an error as
described in , and the content of preg is undefined.

Funnily enough, there is also the `regerror()` function which should be
given a pointer to such a "failed" `regex_t` -- the content of which
would supposedly be undefined -- and which may investigate it to come up
with a detailed error message.

In any case, the example in that document shows how `regfree()` is not
called after `regcomp()` fails.

We have quite a few users of this API and most get this right. These
three users do not.

Several implementations can handle this just fine [2] and these code paths
supposedly have not wreaked havoc or we'd have heard about it. (These
are all in code paths where git got bad input and is just about to die
anyway.) But let's just avoid the issue altogether.

[1] http://pubs.opengroup.org/onlinepubs/9699919799/functions/regcomp.html

[2] https://www.redhat.com/archives/libvir-list/2013-September/msg00262.html

Researched-by: Eric Sunshine 
Signed-off-byi Martin Ågren 
---

On 14 May 2018 at 05:05, Eric Sunshine  wrote:
> My research (for instance [1,2]) seems to indicate that it would be
> better to avoid regfree() upon failed regcomp(), even though such a
> situation is handled sanely in some implementations.
>
> [1]: https://www.redhat.com/archives/libvir-list/2013-September/msg00276.html
> [2]: https://www.redhat.com/archives/libvir-list/2013-September/msg00273.html

Thank you for researching this. I think it would make sense to get rid
of the few places we have where we get this wrong (if our understanding
of this being undefined is right).

 diffcore-pickaxe.c | 1 -
 grep.c | 2 --
 2 files changed, 3 deletions(-)

diff --git a/diffcore-pickaxe.c b/diffcore-pickaxe.c
index 239ce5122b..800a899c86 100644
--- a/diffcore-pickaxe.c
+++ b/diffcore-pickaxe.c
@@ -215,7 +215,6 @@ static void regcomp_or_die(regex_t *regex, const char 
*needle, int cflags)
/* The POSIX.2 people are surely sick */
char errbuf[1024];
regerror(err, regex, errbuf, 1024);
-   regfree(regex);
die("invalid regex: %s", errbuf);
}
 }
diff --git a/grep.c b/grep.c
index 65b90c10a3..5e4f3f9a9d 100644
--- a/grep.c
+++ b/grep.c
@@ -636,7 +636,6 @@ static void compile_fixed_regexp(struct grep_pat *p, struct 
grep_opt *opt)
if (err) {
char errbuf[1024];
regerror(err, >regexp, errbuf, sizeof(errbuf));
-   regfree(>regexp);
compile_regexp_failed(p, errbuf);
}
 }
@@ -701,7 +700,6 @@ static void compile_regexp(struct grep_pat *p, struct 
grep_opt *opt)
if (err) {
char errbuf[1024];
regerror(err, >regexp, errbuf, 1024);
-   regfree(>regexp);
compile_regexp_failed(p, errbuf);
}
 }
-- 
2.17.0.840.g5d83f92caf