Re: [PATCH] refs: make sure we never pass NULL to hashcpy
Michael Haggertywrites: > So `ref_transaction_update()` *does* need to set or clear the `HAVE_NEW` > and `HAVE_OLD` bits as I sketched, to impedance-match between the two > conventions. OK, so ignoring HAVE_NEW/HAVE_OLD bits that the callers of ref_transaction_update() may set in flags, and having ref_transaction_update() compute these bits based on new/old_sha1 pointers from scratch, would be the right thing to do. IOW flags &= ~(REF_HAVE_NEW|REF_HAVE_OLD); if (new_sha1) flags |= REF_HAVE_NEW; if (old_sha1) flags |= REF_HAVE_OLD; and your earlier "Does the warning go away if you change the line to" does essentially the same thing. > It's a shame how much time we've wasted discussing this. Maybe the code > is trying to be too clever/efficient and needs a rethink. It might be the case, but I do not know what to blame is "the two conventions", an over-eager compiler, or a confused commenter on the thread (that's me), though ;-).
Re: [PATCH] refs: make sure we never pass NULL to hashcpy
On 09/08/2017 02:46 AM, Junio C Hamano wrote: > Michael Haggertywrites: > >> I did just realize one thing: `ref_transaction_update()` takes `flags` >> as an argument and alters it using >> >>> flags |= (new_sha1 ? REF_HAVE_NEW : 0) | (old_sha1 ? REF_HAVE_OLD : >>> 0); >> >> Perhaps gcc is *more* intelligent than we give it credit for, and is >> actually worried that the `flags` argument passed in by the caller >> might *already* have one of these bits set. In that case >> `ref_transaction_add_update()` would indeed be called incorrectly. >> Does the warning go away if you change that line to >> >>> if (new_sha1) >>> flags |=REF_HAVE_NEW; >>> else >>> flags &= ~REF_HAVE_NEW; >>> if (old_sha1) >>> flags |=REF_HAVE_OLD; >>> else >>> flags &= ~REF_HAVE_OLD; >> >> ? This might be a nice change to have anyway, to isolate >> `ref_transaction_update()` from mistakes by its callers. > > I understand "drop HAVE_NEW bit if new_sha1 is NULL" part, but not > the other side "add HAVE_NEW if new_SHA1 is not NULL"---doesn't the > NEW/OLD flag exist exactly because some callers pass the address of > an embedded oid.hash[] or null_sha1, instead of NULL, when one side > does not exist? So new|old being NULL is a definite signal that we > need to drop HAVE_NEW|OLD, but the reverse may not be true, no? Is > it OK to overwrite null_sha1[] that is passed from some codepaths? > > ref_transaction_create and _delete pass null_sha1 on the missing > side, while ref_transaction_verify passes NULL, while calling > _update(). Should this distinction affect how _add_update() gets > called? There are two functions under discussion: * `ref_transaction_add_update()` is the low-level, private function that uses the `HAVE_{NEW,OLD}` bits to decide what to do. * `ref_transaction_update()` (like `ref_transaction_{create,delete,verify}()`) are public functions that ignore the `HAVE_{NEW,OLD}` bits and base their behavior on whether `new_sha1` and `old_sha1` are NULL. Each of these functions has to support three possibilities for its SHA-1 arguments: 1. The SHA-1 is provided and not `null_sha1`—in this case it must match the old value (if `old_sha1`) or it is the value to be set as the new value (if `new_sha1`). 2. The SHA-1 is provided and is equal to `null_sha1`—in this case the reference must not already exist (if `old_sha1` is `null_sha1`) or it will be deleted (if `new_sha1` is `null_sha1`). 3. The SHA-1 is not provided at all—in this case the old value is ignored (if `old_sha1` is not provided) or the reference is left unchanged (if `new_sha1` is not provided). Much of the current confusion stems because `ref_transaction_add_update()` encodes the third condition using the `REF_HAVE_*` bits, whereas `ref_transaction_update()` and its friends encode the third condition by setting `old_sha1` or `new_sha1` to `NULL`. So `ref_transaction_update()` *does* need to set or clear the `HAVE_NEW` and `HAVE_OLD` bits as I sketched, to impedance-match between the two conventions. It's a shame how much time we've wasted discussing this. Maybe the code is trying to be too clever/efficient and needs a rethink. Michael
Re: [PATCH] refs: make sure we never pass NULL to hashcpy
Michael Haggertywrites: > I did just realize one thing: `ref_transaction_update()` takes `flags` > as an argument and alters it using > >> flags |= (new_sha1 ? REF_HAVE_NEW : 0) | (old_sha1 ? REF_HAVE_OLD : >> 0); > > Perhaps gcc is *more* intelligent than we give it credit for, and is > actually worried that the `flags` argument passed in by the caller > might *already* have one of these bits set. In that case > `ref_transaction_add_update()` would indeed be called incorrectly. > Does the warning go away if you change that line to > >> if (new_sha1) >> flags |=REF_HAVE_NEW; >> else >> flags &= ~REF_HAVE_NEW; >> if (old_sha1) >> flags |=REF_HAVE_OLD; >> else >> flags &= ~REF_HAVE_OLD; > > ? This might be a nice change to have anyway, to isolate > `ref_transaction_update()` from mistakes by its callers. I understand "drop HAVE_NEW bit if new_sha1 is NULL" part, but not the other side "add HAVE_NEW if new_SHA1 is not NULL"---doesn't the NEW/OLD flag exist exactly because some callers pass the address of an embedded oid.hash[] or null_sha1, instead of NULL, when one side does not exist? So new|old being NULL is a definite signal that we need to drop HAVE_NEW|OLD, but the reverse may not be true, no? Is it OK to overwrite null_sha1[] that is passed from some codepaths? ref_transaction_create and _delete pass null_sha1 on the missing side, while ref_transaction_verify passes NULL, while calling _update(). Should this distinction affect how _add_update() gets called?
Re: [PATCH] refs: make sure we never pass NULL to hashcpy
On 09/07, Michael Haggerty wrote: > On Wed, Sep 6, 2017 at 3:26 AM, Junio C Hamanowrote: > > Thomas Gummerer writes: > > > >> gcc on arch linux (version 7.1.1) warns that a NULL argument is passed > >> as the second parameter of memcpy. > >> [...] > > > > It is hugely annoying to see a halfway-intelligent compiler forces > > you to add such pointless asserts. > > > > The only way the compiler could error on this is by inferring the > > fact that new_sha1/old_sha1 could be NULL by looking at the callsite > > in ref_transaction_update() where these are used as conditionals to > > set HAVE_NEW/HAVE_OLD that are passed. Even if the compiler were > > doing the whole-program analysis, the other two callsites of the > > function pass the address of oid.hash[] in an oid structure so it > > should know these won't be NULL. > > > > [...] > > > > I wonder if REF_HAVE_NEW/REF_HAVE_OLD are really needed in these > > codepaths, though. Perhaps we can instead declare !!new_sha1 means > > we have the new side and rewrite the above part to > > > > if (new_sha1) > > hashcpy(update->new_oid.hash, new_sha1); > > > > without an extra and totally pointless assert()? > > The ultimate reason for those flags is that `struct ref_update` embeds > `new_oid` and `old_oid` directly in the struct, so there is no way to > set it to "NULL". (The `is_null_sha1` value is used for a different > purpose.) So those flags keep track of whether the corresponding value > is specified or absent. > > Four of the five callers of `ref_transaction_add_update()` are > constructing a new `ref_update` from an old one. They currently don't > have to look into `flags`; they just pass it on (possibly changing a > bit or two). Implementing your proposal would oblige those callers to > change from something like Thanks for the explanation! > > new_update = ref_transaction_add_update( > > transaction, "HEAD", > > update->flags | REF_LOG_ONLY | REF_NODEREF, > > update->new_oid.hash, update->old_oid.hash, > > update->msg); > > to > > > new_update = ref_transaction_add_update( > > transaction, "HEAD", > > update->flags | REF_LOG_ONLY | REF_NODEREF, > > (update->flags & REF_HAVE_NEW) ? update->new_oid.hash : NULL, > > (update->flags & REF_HAVE_OLD) ? update->old_oid.hash : NULL, > > update->msg); > > It's not the end of the world, but it's annoying. > `ref_transaction_add_update()` was meant to be a low-level, > low-overhead way of allocating a `struct ref_update` and add it to a > transaction. > > Another solution (also annoying, but maybe a tad less so) would be to > change the one iffy caller, `ref_transaction_update()`, to pass in a > pointer to the null OID for `new_sha1` and `old_sha1` when the > corresponding flags are turned off. That value would never be looked > at, but it would hopefully reassure gcc. > > I did just realize one thing: `ref_transaction_update()` takes `flags` > as an argument and alters it using > > > flags |= (new_sha1 ? REF_HAVE_NEW : 0) | (old_sha1 ? REF_HAVE_OLD : > > 0); > > Perhaps gcc is *more* intelligent than we give it credit for, and is > actually worried that the `flags` argument passed in by the caller > might *already* have one of these bits set. In that case > `ref_transaction_add_update()` would indeed be called incorrectly. > Does the warning go away if you change that line to > > > if (new_sha1) > > flags |=REF_HAVE_NEW; > > else > > flags &= ~REF_HAVE_NEW; > > if (old_sha1) > > flags |=REF_HAVE_OLD; > > else > > flags &= ~REF_HAVE_OLD; > > ? Indeed that fixes it, great catch! gcc is indeed smarter than we gave it credit for, this makes a lot of sense. Interestingly stripping away the flags fixes the compiler warning: diff --git a/refs.c b/refs.c index ba22f4acef..2e6871beac 100644 --- a/refs.c +++ b/refs.c @@ -921,6 +921,9 @@ int ref_transaction_update(struct ref_transaction *transaction, return -1; } + flags &= ~REF_HAVE_NEW; + flags &= ~REF_HAVE_OLD; + flags |= (new_sha1 ? REF_HAVE_NEW : 0) | (old_sha1 ? REF_HAVE_OLD : 0); ref_transaction_add_update(transaction, refname, flags, while checking that the flags are not there in the first place still leaves the compiler warning (whether I use "die()" or just "return -1" doesn't matter in this case): diff --git a/refs.c b/refs.c index ba22f4acef..62ff283755 100644 --- a/refs.c +++ b/refs.c @@ -921,6 +921,9 @@ int ref_transaction_update(struct ref_transaction *transaction, return -1; } + if ((flags & REF_HAVE_NEW) != 0 || (flags & REF_HAVE_OLD) != 0) + die("BUG: passed invalid flag to ref_transaction_update"); + flags |= (new_sha1 ? REF_HAVE_NEW : 0) | (old_sha1 ? REF_HAVE_OLD : 0);
Re: [PATCH] refs: make sure we never pass NULL to hashcpy
On Wed, Sep 6, 2017 at 3:26 AM, Junio C Hamanowrote: > Thomas Gummerer writes: > >> gcc on arch linux (version 7.1.1) warns that a NULL argument is passed >> as the second parameter of memcpy. >> [...] > > It is hugely annoying to see a halfway-intelligent compiler forces > you to add such pointless asserts. > > The only way the compiler could error on this is by inferring the > fact that new_sha1/old_sha1 could be NULL by looking at the callsite > in ref_transaction_update() where these are used as conditionals to > set HAVE_NEW/HAVE_OLD that are passed. Even if the compiler were > doing the whole-program analysis, the other two callsites of the > function pass the address of oid.hash[] in an oid structure so it > should know these won't be NULL. > > [...] > > I wonder if REF_HAVE_NEW/REF_HAVE_OLD are really needed in these > codepaths, though. Perhaps we can instead declare !!new_sha1 means > we have the new side and rewrite the above part to > > if (new_sha1) > hashcpy(update->new_oid.hash, new_sha1); > > without an extra and totally pointless assert()? The ultimate reason for those flags is that `struct ref_update` embeds `new_oid` and `old_oid` directly in the struct, so there is no way to set it to "NULL". (The `is_null_sha1` value is used for a different purpose.) So those flags keep track of whether the corresponding value is specified or absent. Four of the five callers of `ref_transaction_add_update()` are constructing a new `ref_update` from an old one. They currently don't have to look into `flags`; they just pass it on (possibly changing a bit or two). Implementing your proposal would oblige those callers to change from something like > new_update = ref_transaction_add_update( > transaction, "HEAD", > update->flags | REF_LOG_ONLY | REF_NODEREF, > update->new_oid.hash, update->old_oid.hash, > update->msg); to > new_update = ref_transaction_add_update( > transaction, "HEAD", > update->flags | REF_LOG_ONLY | REF_NODEREF, > (update->flags & REF_HAVE_NEW) ? update->new_oid.hash : NULL, > (update->flags & REF_HAVE_OLD) ? update->old_oid.hash : NULL, > update->msg); It's not the end of the world, but it's annoying. `ref_transaction_add_update()` was meant to be a low-level, low-overhead way of allocating a `struct ref_update` and add it to a transaction. Another solution (also annoying, but maybe a tad less so) would be to change the one iffy caller, `ref_transaction_update()`, to pass in a pointer to the null OID for `new_sha1` and `old_sha1` when the corresponding flags are turned off. That value would never be looked at, but it would hopefully reassure gcc. I did just realize one thing: `ref_transaction_update()` takes `flags` as an argument and alters it using > flags |= (new_sha1 ? REF_HAVE_NEW : 0) | (old_sha1 ? REF_HAVE_OLD : > 0); Perhaps gcc is *more* intelligent than we give it credit for, and is actually worried that the `flags` argument passed in by the caller might *already* have one of these bits set. In that case `ref_transaction_add_update()` would indeed be called incorrectly. Does the warning go away if you change that line to > if (new_sha1) > flags |=REF_HAVE_NEW; > else > flags &= ~REF_HAVE_NEW; > if (old_sha1) > flags |=REF_HAVE_OLD; > else > flags &= ~REF_HAVE_OLD; ? This might be a nice change to have anyway, to isolate `ref_transaction_update()` from mistakes by its callers. For that matter, one might want to be even more selective about what bits are allowed in the `flags` argument to `ref_transaction_update()`'s callers: > flags &= REF_ALLOWED_FLAGS; /* value would need to be determined */ Michael
Re: [PATCH] refs: make sure we never pass NULL to hashcpy
On Wed, Sep 6, 2017 at 2:26 AM, Junio C Hamanowrote: > Thomas Gummerer writes: > >> gcc on arch linux (version 7.1.1) warns that a NULL argument is passed >> as the second parameter of memcpy. >> >> In file included from refs.c:5:0: >> refs.c: In function ‘ref_transaction_verify’: >> cache.h:948:2: error: argument 2 null where non-null expected >> [-Werror=nonnull] >> memcpy(sha_dst, sha_src, GIT_SHA1_RAWSZ); >> ^~~~ >> In file included from git-compat-util.h:165:0, >> from cache.h:4, >> from refs.c:5: >> /usr/include/string.h:43:14: note: in a call to function ‘memcpy’ declared >> here >> extern void *memcpy (void *__restrict __dest, const void *__restrict __src, >> ^~ >> ... >> diff --git a/refs.c b/refs.c >> index ba22f4acef..d8c12a9c44 100644 >> --- a/refs.c >> +++ b/refs.c >> @@ -896,10 +896,14 @@ struct ref_update *ref_transaction_add_update( >> >> update->flags = flags; >> >> - if (flags & REF_HAVE_NEW) >> + if (flags & REF_HAVE_NEW) { >> + assert(new_sha1); >> hashcpy(update->new_oid.hash, new_sha1); >> - if (flags & REF_HAVE_OLD) >> + } >> + if (flags & REF_HAVE_OLD) { >> + assert(old_sha1); >> hashcpy(update->old_oid.hash, old_sha1); >> + } > > It is hugely annoying to see a halfway-intelligent compiler forces > you to add such pointless asserts. > > The only way the compiler could error on this is by inferring the > fact that new_sha1/old_sha1 could be NULL by looking at the callsite > in ref_transaction_update() where these are used as conditionals to > set HAVE_NEW/HAVE_OLD that are passed. Even if the compiler were > doing the whole-program analysis, the other two callsites of the > function pass the address of oid.hash[] in an oid structure so it > should know these won't be NULL. > > Or is the compiler being really dumb and triggering an error for > every use of > > memcpy(dst, src, size); > > that must now be written as > > assert(src); > memcpy(dst, src, size); > > ??? That would be doubly annoying No, I think it can't quite deal with the flags that are passed in. I'm on a different machine today, so I can't actually check, but that's what I would expect at least. > I wonder if REF_HAVE_NEW/REF_HAVE_OLD are really needed in these > codepaths, though. Perhaps we can instead declare !!new_sha1 means > we have the new side and rewrite the above part to > > if (new_sha1) > hashcpy(update->new_oid.hash, new_sha1); > > without an extra and totally pointless assert()? Yeah, that seems much nicer. I'll try that and send a new a patch (though I won't get to it before tomorrow). Thanks for the review. >> update->msg = xstrdup_or_null(msg); >> return update; >> }
Re: [PATCH] refs: make sure we never pass NULL to hashcpy
Thomas Gummererwrites: > gcc on arch linux (version 7.1.1) warns that a NULL argument is passed > as the second parameter of memcpy. > > In file included from refs.c:5:0: > refs.c: In function ‘ref_transaction_verify’: > cache.h:948:2: error: argument 2 null where non-null expected > [-Werror=nonnull] > memcpy(sha_dst, sha_src, GIT_SHA1_RAWSZ); > ^~~~ > In file included from git-compat-util.h:165:0, > from cache.h:4, > from refs.c:5: > /usr/include/string.h:43:14: note: in a call to function ‘memcpy’ declared > here > extern void *memcpy (void *__restrict __dest, const void *__restrict __src, > ^~ > ... > diff --git a/refs.c b/refs.c > index ba22f4acef..d8c12a9c44 100644 > --- a/refs.c > +++ b/refs.c > @@ -896,10 +896,14 @@ struct ref_update *ref_transaction_add_update( > > update->flags = flags; > > - if (flags & REF_HAVE_NEW) > + if (flags & REF_HAVE_NEW) { > + assert(new_sha1); > hashcpy(update->new_oid.hash, new_sha1); > - if (flags & REF_HAVE_OLD) > + } > + if (flags & REF_HAVE_OLD) { > + assert(old_sha1); > hashcpy(update->old_oid.hash, old_sha1); > + } It is hugely annoying to see a halfway-intelligent compiler forces you to add such pointless asserts. The only way the compiler could error on this is by inferring the fact that new_sha1/old_sha1 could be NULL by looking at the callsite in ref_transaction_update() where these are used as conditionals to set HAVE_NEW/HAVE_OLD that are passed. Even if the compiler were doing the whole-program analysis, the other two callsites of the function pass the address of oid.hash[] in an oid structure so it should know these won't be NULL. Or is the compiler being really dumb and triggering an error for every use of memcpy(dst, src, size); that must now be written as assert(src); memcpy(dst, src, size); ??? That would be doubly annoying. I wonder if REF_HAVE_NEW/REF_HAVE_OLD are really needed in these codepaths, though. Perhaps we can instead declare !!new_sha1 means we have the new side and rewrite the above part to if (new_sha1) hashcpy(update->new_oid.hash, new_sha1); without an extra and totally pointless assert()? > update->msg = xstrdup_or_null(msg); > return update; > }
[PATCH] refs: make sure we never pass NULL to hashcpy
gcc on arch linux (version 7.1.1) warns that a NULL argument is passed as the second parameter of memcpy. In file included from refs.c:5:0: refs.c: In function ‘ref_transaction_verify’: cache.h:948:2: error: argument 2 null where non-null expected [-Werror=nonnull] memcpy(sha_dst, sha_src, GIT_SHA1_RAWSZ); ^~~~ In file included from git-compat-util.h:165:0, from cache.h:4, from refs.c:5: /usr/include/string.h:43:14: note: in a call to function ‘memcpy’ declared here extern void *memcpy (void *__restrict __dest, const void *__restrict __src, ^~ Tracking this error down, we can track it back to ref_transaction_add_update. where the call to hashcpy is however protected by the flags that are passed in. To make sure there's no code path where the wrong flags are passed in, and to help the compiler realize that no NULL parameter is passed as second argument to hashcpy, add asserts that this is indeed the case. Signed-off-by: Thomas Gummerer--- This is based on top of ma/ts-cleanups, as that fixes another compiler warning with gcc 7.1.1. refs.c | 8 ++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/refs.c b/refs.c index ba22f4acef..d8c12a9c44 100644 --- a/refs.c +++ b/refs.c @@ -896,10 +896,14 @@ struct ref_update *ref_transaction_add_update( update->flags = flags; - if (flags & REF_HAVE_NEW) + if (flags & REF_HAVE_NEW) { + assert(new_sha1); hashcpy(update->new_oid.hash, new_sha1); - if (flags & REF_HAVE_OLD) + } + if (flags & REF_HAVE_OLD) { + assert(old_sha1); hashcpy(update->old_oid.hash, old_sha1); + } update->msg = xstrdup_or_null(msg); return update; } -- 2.14.1.480.gb18f417b89