Re: [PATCH 00/12] Clean up notes-related code around `load_subtree()`

2017-09-12 Thread Lars Schneider

> On 10 Sep 2017, at 09:39, Jeff King  wrote:
> 
> On Sun, Sep 10, 2017 at 06:45:08AM +0200, Michael Haggerty wrote:
> 
>>> So nothing to see here, but since I spent 20 minutes scratching my head
>>> (and I know others look at Coverity output and may scratch their heads
>>> too), I thought it was worth writing up. And also if I'm wrong, it would
>>> be good to know. ;)
>> 
>> Thanks for looking into this. I agree with your analysis.
>> 
>> I wonder whether it is the factor of two between path lengths and byte
>> lengths that is confusing Coverity. Perhaps the patch below would help.
>> It requires an extra, superfluous, check, but perhaps makes the code a
>> tad more readable. I'm neutral on whether we would want to make the change.
> 
> Yeah, I do agree that it makes the code's assumptions a bit easier to
> follow.
> 
>> Is there a way to ask Coverity whether a hypothetical change would
>> remove the warning, short of merging the change to master?
> 
> You can download and run the build portion of the coverity tools
> yourself. IIRC, that pushes the build up to their servers which then do
> the analysis (you can make your own "project", or use the existing "git"
> project -- I checked and you are already listed as an admin). I recall
> it being a minor pain to get it set up, but not too bad.
> 
> Stefan runs it against "pu" on a regular basis, which is where the
> emailed results come from. So just having Junio merge it to "pu" would
> be enough to get results.
> 
> I noticed that they now have some GitHub/Travis integration:
> 
>  https://scan.coverity.com/github
> 
> I'm not sure if that is new, or if we just didn't notice it before. ;)
> But that probably makes more sense to use than ad-hoc uploading (and
> maybe it would make it easy for you to test personal branches, too).

Coverity scans Git already:
https://scan.coverity.com/projects/70

I requested access to this Coverity project to integrate into our TravisCI
build.

- Lars

Re: [PATCH 00/12] Clean up notes-related code around `load_subtree()`

2017-09-12 Thread Michael Haggerty
On Sun, Sep 10, 2017 at 9:39 AM, Jeff King  wrote:
> On Sun, Sep 10, 2017 at 06:45:08AM +0200, Michael Haggerty wrote:
>
>> > So nothing to see here, but since I spent 20 minutes scratching my head
>> > (and I know others look at Coverity output and may scratch their heads
>> > too), I thought it was worth writing up. And also if I'm wrong, it would
>> > be good to know. ;)
>>
>> Thanks for looking into this. I agree with your analysis.
>>
>> I wonder whether it is the factor of two between path lengths and byte
>> lengths that is confusing Coverity. Perhaps the patch below would help.
>> It requires an extra, superfluous, check, but perhaps makes the code a
>> tad more readable. I'm neutral on whether we would want to make the change.
>
> Yeah, I do agree that it makes the code's assumptions a bit easier to
> follow.
>
>> Is there a way to ask Coverity whether a hypothetical change would
>> remove the warning, short of merging the change to master?
>
> You can download and run the build portion of the coverity tools
> yourself. [...]

Thanks for the info.

My suggested tweak doesn't appease Coverity. Given that, I don't think
I'll bother adding it to the patch series.

Michael


Re: [PATCH 00/12] Clean up notes-related code around `load_subtree()`

2017-09-10 Thread Jeff King
On Sun, Sep 10, 2017 at 06:45:08AM +0200, Michael Haggerty wrote:

> > So nothing to see here, but since I spent 20 minutes scratching my head
> > (and I know others look at Coverity output and may scratch their heads
> > too), I thought it was worth writing up. And also if I'm wrong, it would
> > be good to know. ;)
> 
> Thanks for looking into this. I agree with your analysis.
> 
> I wonder whether it is the factor of two between path lengths and byte
> lengths that is confusing Coverity. Perhaps the patch below would help.
> It requires an extra, superfluous, check, but perhaps makes the code a
> tad more readable. I'm neutral on whether we would want to make the change.

Yeah, I do agree that it makes the code's assumptions a bit easier to
follow.

> Is there a way to ask Coverity whether a hypothetical change would
> remove the warning, short of merging the change to master?

You can download and run the build portion of the coverity tools
yourself. IIRC, that pushes the build up to their servers which then do
the analysis (you can make your own "project", or use the existing "git"
project -- I checked and you are already listed as an admin). I recall
it being a minor pain to get it set up, but not too bad.

Stefan runs it against "pu" on a regular basis, which is where the
emailed results come from. So just having Junio merge it to "pu" would
be enough to get results.

I noticed that they now have some GitHub/Travis integration:

  https://scan.coverity.com/github

I'm not sure if that is new, or if we just didn't notice it before. ;)
But that probably makes more sense to use than ad-hoc uploading (and
maybe it would make it easy for you to test personal branches, too).

-Peff


Re: [PATCH 00/12] Clean up notes-related code around `load_subtree()`

2017-09-09 Thread Michael Haggerty
On 09/09/2017 12:31 PM, Jeff King wrote:
> On Sat, Aug 26, 2017 at 10:28:00AM +0200, Michael Haggerty wrote:
> 
>> It turns out that the comment is incorrect, but there was nevertheless
>> plenty that could be cleaned up in the area:
>>
>> * Make macro `GIT_NIBBLE` safer by adding some parentheses
>> * Remove some dead code
>> * Fix some memory leaks
>> * Fix some obsolete and incorrect comments
>> * Reject "notes" that are not blobs
>>
>> I hope the result is also easier to understand.
>>
>> This branch is also available from my Git fork [1] as branch
>> `load-subtree-cleanup`.
> 
> FYI, Coverity seems to complain about "pu" after this series is merged, but
> I think it's wrong.  It says:
> 
>   *** CID 1417630:  Memory - illegal accesses  (OVERRUN)
>   /notes.c: 458 in load_subtree()
>   452 
>   453 /*
>   454  * Pad the rest of the SHA-1 with zeros,
>   455  * except for the last byte, where we write
>   456  * the length:
>   457  */
>   >>> CID 1417630:  Memory - illegal accesses  (OVERRUN)
>   >>> Overrunning array of 20 bytes at byte offset 20 by dereferencing 
> pointer "_oid.hash[len]".
>   458 memset(object_oid.hash + len, 0, GIT_SHA1_RAWSZ 
> - len - 1);
>   459 object_oid.hash[KEY_INDEX] = (unsigned char)len;
>   460 
>   461 type = PTR_TYPE_SUBTREE;
>   462 } else {
>   463 /* This can't be part of a note */
> 
> I agree that if "len" were 20 here that would be a problem, but I don't
> think that's possible.
> 
> The tool correctly claims that prefix_len can be up to 19, due to the
> assert:
> 
>  3. cond_at_most: Checking prefix_len >= 20UL implies that prefix_len may 
> be up to 19 on the false branch.
>   420if (prefix_len >= GIT_SHA1_RAWSZ)
>   421BUG("prefix_len (%"PRIuMAX") is out of range", 
> (uintmax_t)prefix_len);
> 
> Then it claims:
> 
> 13. Condition path_len == 2 * (20 - prefix_len), taking false branch.
>   430if (path_len == 2 * (GIT_SHA1_RAWSZ - prefix_len)) {
>   431/* This is potentially the remainder of the 
> SHA-1 */
> 
> So we know that either prefix_len is not 19, or that path_len is not 2
> (since that combination would cause us to take the true branch here).
> But then it goes on to say:
> 
> 14. Condition path_len == 2, taking true branch.
>   442} else if (path_len == 2) {
>   443/* This is potentially an internal node */
> 
> which I believe must mean that prefix_len cannot be 19 here. And yet it
> says:
> 
> 15. assignment: Assigning: len = prefix_len. The value of len may now be 
> up to 19.
>   444size_t len = prefix_len;
>   445
>   [...]
>  17. incr: Incrementing len. The value of len may now be up to 20.
>  18. Condition hex_to_bytes(_oid.hash[len++], entry.path, 1), 
> taking false branch.
>   450if (hex_to_bytes(object_oid.hash + len++, 
> entry.path, 1))
>   451goto handle_non_note; /* entry.path is 
> not a SHA1 */
> 
> I think that's impossible, and Coverity simply isn't smart enough to
> shrink the set of possible values for prefix_len based on the set of
> if-else conditions.
> 
> So nothing to see here, but since I spent 20 minutes scratching my head
> (and I know others look at Coverity output and may scratch their heads
> too), I thought it was worth writing up. And also if I'm wrong, it would
> be good to know. ;)

Thanks for looking into this. I agree with your analysis.

I wonder whether it is the factor of two between path lengths and byte
lengths that is confusing Coverity. Perhaps the patch below would help.
It requires an extra, superfluous, check, but perhaps makes the code a
tad more readable. I'm neutral on whether we would want to make the change.

Is there a way to ask Coverity whether a hypothetical change would
remove the warning, short of merging the change to master?

Michael

diff --git a/notes.c b/notes.c
index 27d232f294..34f623f7b1 100644
--- a/notes.c
+++ b/notes.c
@@ -426,8 +426,14 @@ static void load_subtree(struct notes_tree *t,
struct leaf_node *subtree,
unsigned char type;
struct leaf_node *l;
size_t path_len = strlen(entry.path);
+   size_t path_bytes;

-   if (path_len == 2 * (GIT_SHA1_RAWSZ - prefix_len)) {
+   if (path_len % 2 != 0)
+   /* Path chunks must come in pairs of hex characters */
+   goto handle_non_note;
+
+   path_bytes = path_len / 2;
+   if (path_bytes == GIT_SHA1_RAWSZ - prefix_len) {
/* This is potentially the remainder of the SHA-1 */

if (!S_ISREG(entry.mode))

Re: [PATCH 00/12] Clean up notes-related code around `load_subtree()`

2017-09-09 Thread Jeff King
On Sat, Aug 26, 2017 at 10:28:00AM +0200, Michael Haggerty wrote:

> It turns out that the comment is incorrect, but there was nevertheless
> plenty that could be cleaned up in the area:
> 
> * Make macro `GIT_NIBBLE` safer by adding some parentheses
> * Remove some dead code
> * Fix some memory leaks
> * Fix some obsolete and incorrect comments
> * Reject "notes" that are not blobs
> 
> I hope the result is also easier to understand.
> 
> This branch is also available from my Git fork [1] as branch
> `load-subtree-cleanup`.

FYI, Coverity seems to complain about "pu" after this series is merged, but
I think it's wrong.  It says:

  *** CID 1417630:  Memory - illegal accesses  (OVERRUN)
  /notes.c: 458 in load_subtree()
  452 
  453   /*
  454* Pad the rest of the SHA-1 with zeros,
  455* except for the last byte, where we write
  456* the length:
  457*/
  >>> CID 1417630:  Memory - illegal accesses  (OVERRUN)
  >>> Overrunning array of 20 bytes at byte offset 20 by dereferencing 
pointer "_oid.hash[len]".
  458   memset(object_oid.hash + len, 0, GIT_SHA1_RAWSZ 
- len - 1);
  459   object_oid.hash[KEY_INDEX] = (unsigned char)len;
  460 
  461   type = PTR_TYPE_SUBTREE;
  462   } else {
  463   /* This can't be part of a note */

I agree that if "len" were 20 here that would be a problem, but I don't
think that's possible.

The tool correctly claims that prefix_len can be up to 19, due to the
assert:

 3. cond_at_most: Checking prefix_len >= 20UL implies that prefix_len may 
be up to 19 on the false branch.
  420if (prefix_len >= GIT_SHA1_RAWSZ)
  421BUG("prefix_len (%"PRIuMAX") is out of range", 
(uintmax_t)prefix_len);

Then it claims:

13. Condition path_len == 2 * (20 - prefix_len), taking false branch.
  430if (path_len == 2 * (GIT_SHA1_RAWSZ - prefix_len)) {
  431/* This is potentially the remainder of the SHA-1 
*/

So we know that either prefix_len is not 19, or that path_len is not 2
(since that combination would cause us to take the true branch here).
But then it goes on to say:

14. Condition path_len == 2, taking true branch.
  442} else if (path_len == 2) {
  443/* This is potentially an internal node */

which I believe must mean that prefix_len cannot be 19 here. And yet it
says:

15. assignment: Assigning: len = prefix_len. The value of len may now be up 
to 19.
  444size_t len = prefix_len;
  445
  [...]
 17. incr: Incrementing len. The value of len may now be up to 20.
 18. Condition hex_to_bytes(_oid.hash[len++], entry.path, 1), taking 
false branch.
  450if (hex_to_bytes(object_oid.hash + len++, 
entry.path, 1))
  451goto handle_non_note; /* entry.path is not 
a SHA1 */

I think that's impossible, and Coverity simply isn't smart enough to
shrink the set of possible values for prefix_len based on the set of
if-else conditions.

So nothing to see here, but since I spent 20 minutes scratching my head
(and I know others look at Coverity output and may scratch their heads
too), I thought it was worth writing up. And also if I'm wrong, it would
be good to know. ;)

-Peff


Re: [PATCH 00/12] Clean up notes-related code around `load_subtree()`

2017-08-26 Thread Johan Herland
On Sat, Aug 26, 2017 at 10:28 AM, Michael Haggerty  wrote:
[...]
> plenty that could be cleaned up in the area:
>
> * Make macro `GIT_NIBBLE` safer by adding some parentheses
> * Remove some dead code
> * Fix some memory leaks
> * Fix some obsolete and incorrect comments
> * Reject "notes" that are not blobs
>
> I hope the result is also easier to understand.

I looked through the series, and the patches look good to me, although
I do agree with Junio's comments on #2.

Thanks for a long-overdue cleanup in one of the hairier parts of
the notes code. The end result reads a lot better IMHO.

...Johan


[PATCH 00/12] Clean up notes-related code around `load_subtree()`

2017-08-26 Thread Michael Haggerty
While putzing around in the notes code quite some time ago, I found
this comment:

/*
 * Determine full path for this non-note entry:
 * The filename is already found in entry.path, but the
 * directory part of the path must be deduced from the subtree
 * containing this entry. We assume here that the overall notes
 * tree follows a strict byte-based progressive fanout
 * structure (i.e. using 2/38, 2/2/36, etc. fanouts, and not
 * e.g. 4/36 fanout). This means that if a non-note is found at
 * path "dead/beef", the following code will register it as
 * being found on "de/ad/beef".
 * On the other hand, if you use such non-obvious non-note
 * paths in the middle of a notes tree, you deserve what's
 * coming to you ;). Note that for non-notes that are not
 * SHA1-like at the top level, there will be no problems.
 *
 * To conclude, it is strongly advised to make sure non-notes
 * have at least one non-hex character in the top-level path
 * component.
 */

This was enough of a nerd snipe to get me to dig into the code.

It turns out that the comment is incorrect, but there was nevertheless
plenty that could be cleaned up in the area:

* Make macro `GIT_NIBBLE` safer by adding some parentheses
* Remove some dead code
* Fix some memory leaks
* Fix some obsolete and incorrect comments
* Reject "notes" that are not blobs

I hope the result is also easier to understand.

This branch is also available from my Git fork [1] as branch
`load-subtree-cleanup`.

Michael

[1] https://github.com/mhagger/git

Michael Haggerty (12):
  notes: make GET_NIBBLE macro more robust
  load_subtree(): remove unnecessary conditional
  load_subtree(): reduce the scope of some local variables
  load_subtree(): fix incorrect comment
  load_subtree(): separate logic for internal vs. terminal entries
  load_subtree(): check earlier whether an internal node is a tree entry
  load_subtree(): only consider blobs to be potential notes
  get_oid_hex_segment(): return 0 on success
  load_subtree(): combine some common code
  get_oid_hex_segment(): don't pad the rest of `oid`
  hex_to_bytes(): simpler replacement for `get_oid_hex_segment()`
  load_subtree(): declare some variables to be `size_t`

 notes.c | 136 +++-
 1 file changed, 66 insertions(+), 70 deletions(-)

-- 
2.11.0