Re: [PATCH 0/9] introducing oideq()

2018-08-28 Thread brian m. carlson
On Tue, Aug 28, 2018 at 05:21:27PM -0400, Jeff King wrote:
> On Sun, Aug 26, 2018 at 08:56:21PM +, brian m. carlson wrote:
> > I would quite like to see this series picked up for v2.20.  If we want
> > to minimize performance regressions with the SHA-256 work, I think it's
> > important.
> 
> Thanks. One of the things I was worried about was causing unnecessary
> conflicts with existing topics, including your work. But if everybody is
> on board, I'd be happy to see this go in early in the next release cycle
> (the longer we wait, the more annoying conflicts Junio has to resolve).

I can send out work that doesn't conflict with this while it makes its
way to master.  There are a lot of test fixes that can go in in the mean
time.

I will be sending out a series that actually implements SHA-256 as an
RFC soon, but I don't think there will be any conflicts (and it will
just be an RFC anyway).
-- 
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204


signature.asc
Description: PGP signature


Re: [PATCH 0/9] introducing oideq()

2018-08-28 Thread Derrick Stolee

On 8/28/2018 5:21 PM, Jeff King wrote:

On Sun, Aug 26, 2018 at 08:56:21PM +, brian m. carlson wrote:


Due to the simplicity of the current code and our inlining, the compiler
can usually figure this out for now. So I wouldn't expect this patch to
actually improve performance right away. But as that discussion shows,
we are likely to take a performance hit as we move to more runtime
determination of the_hash_algo parameters. Having these callers use the
more strict form will potentially help us recover that.

So in that sense we _could_ simply punt on this series until then (and
it's certainly post-v2.19 material). But I think it's worth doing now,
simply from a readability/annotation standpoint. IMHO the resulting code
is more clear (though I've long since taught myself to read !foocmp() as
equality).

I would quite like to see this series picked up for v2.20.  If we want
to minimize performance regressions with the SHA-256 work, I think it's
important.

Thanks. One of the things I was worried about was causing unnecessary
conflicts with existing topics, including your work. But if everybody is
on board, I'd be happy to see this go in early in the next release cycle
(the longer we wait, the more annoying conflicts Junio has to resolve).


I'm happy to take this change whenever. In my opinion, right after 
v2.19.0 is cut would be a great time to merge it into master.


This v2 is good.

Reviewed-by: Derrick Stolee 



Re: [PATCH 0/9] introducing oideq()

2018-08-28 Thread Jeff King
On Sun, Aug 26, 2018 at 08:56:21PM +, brian m. carlson wrote:

> > Due to the simplicity of the current code and our inlining, the compiler
> > can usually figure this out for now. So I wouldn't expect this patch to
> > actually improve performance right away. But as that discussion shows,
> > we are likely to take a performance hit as we move to more runtime
> > determination of the_hash_algo parameters. Having these callers use the
> > more strict form will potentially help us recover that.
> > 
> > So in that sense we _could_ simply punt on this series until then (and
> > it's certainly post-v2.19 material). But I think it's worth doing now,
> > simply from a readability/annotation standpoint. IMHO the resulting code
> > is more clear (though I've long since taught myself to read !foocmp() as
> > equality).
> 
> I would quite like to see this series picked up for v2.20.  If we want
> to minimize performance regressions with the SHA-256 work, I think it's
> important.

Thanks. One of the things I was worried about was causing unnecessary
conflicts with existing topics, including your work. But if everybody is
on board, I'd be happy to see this go in early in the next release cycle
(the longer we wait, the more annoying conflicts Junio has to resolve).

> Applying the following patch on top of this series causes gcc to inline
> both branches, which is pretty much the best we can expect.  I haven't
> benchmarked it, though, so I can't say what the actual performance
> consequence is.
> [...]
> diff --git a/cache.h b/cache.h
> index 3bb88ac6d0..1c182c6ef6 100644
> --- a/cache.h
> +++ b/cache.h
> @@ -1033,7 +1033,10 @@ static inline int oidcmp(const struct object_id *oid1, 
> const struct object_id *o
>  
>  static inline int hasheq(const unsigned char *sha1, const unsigned char 
> *sha2)
>  {
> - return !hashcmp(sha1, sha2);
> + if (the_hash_algo->rawsz == 32) {
> + return !memcmp(sha1, sha2, 32);
> + }
> + return !memcmp(sha1, sha2, 20);
>  }

Yeah, I can reproduce that, too.  I'm actually surprised that's enough
to convince the compiler to inline, based on my earlier tests. I wonder
if it is because it is more able to see the boolean nature of the
memcmp() (i.e., in the earlier versions it was not willing to carry that
information across the conditional boundary for some reason). Or perhaps
it is that this is written slightly differently from my earlier case,
where the fallback did not say "32" explicitly, but rather:

  return memcmp(sha1, sha2, the_hash_algo->rawsz);

For your form, we'd probably want to add a third case arm with a BUG(),
just in case.

> As for this series itself, other than the typos people have pointed out,
> it looks good to me.

Thanks. Here it is again with those typos fixed, but no changes to the
patches themselves.

 1:  7bcf611959 =  1:  7bcf611959 coccinelle: use <...> for function exclusion
 2:  6025ebe5d1 !  2:  0e90944f5a introduce hasheq() and oideq()
@@ -5,11 +5,10 @@
 The main comparison functions we provide for comparing
 object ids are hashcmp() and oidcmp(). These are more
 flexible than a strict equality check, since they also
-express ordering. That makes them them useful for sorting
-and binary searching. However, it also makes them
-potentially slower than a strict equality check. Consider
-this C code, which is traditionally what our hashcmp has
-looked like:
+express ordering. That makes them useful for sorting and
+binary searching. However, it also makes them potentially
+slower than a strict equality check. Consider this C code,
+which is traditionally what our hashcmp has looked like:
 
   #include 
   int hashcmp(const unsigned char *a, const unsigned char *b)
@@ -51,7 +50,7 @@
 though the vast majority don't.
 
 We can solve that by introducing a hasheq() function (and
-matching oideq() wrapper), which callers can use to make
+matching oideq() wrapper), which callers can use to make it
 clear that they only care about equality. For now, the
 implementation will literally be "!hashcmp()", but it frees
 us up later to introduce code optimized specifically for the
 3:  35a05a7e5c =  3:  4f1ea17e79 convert "oidcmp() == 0" to oideq()
 4:  b18fc3994d =  4:  6614cc71e4 convert "hashcmp() == 0" to hasheq()
 5:  0fbcfcf7cc =  5:  c3dd2b2c80 convert "oidcmp() != 0" to "!oideq()"
 6:  32e169b886 =  6:  c94914f1ef convert "hashcmp() != 0" to "!hasheq()"
 7:  b90051c38b =  7:  c773fa9ca4 convert hashmap comparison functions to 
oideq()
 8:  b1e6ad80f0 =  8:  ada5809fab read-cache: use oideq() in ce_compare 
functions
 9:  18be86e078 !  9:  62f24d63a4 show_dirstat: simplify same-content check
@@ -2,7 +2,7 @@
 
 show_dirstat: simplify same-content check
 
-We two nested conditionals to store a content_changed
+We use two 

Re: [PATCH 0/9] introducing oideq()

2018-08-27 Thread Derrick Stolee

On 8/26/2018 4:56 PM, brian m. carlson wrote:

On Sat, Aug 25, 2018 at 04:00:31AM -0400, Jeff King wrote:

This is a follow-up to the discussion in:

   https://public-inbox.org/git/20180822030344.ga14...@sigill.intra.peff.net/

The general idea is that the majority of callers don't care about actual
plus/minus ordering from oidcmp() and hashcmp(); they just want to know
if two oids are equal. The stricter equality check can be optimized
better by the compiler.

Due to the simplicity of the current code and our inlining, the compiler
can usually figure this out for now. So I wouldn't expect this patch to
actually improve performance right away. But as that discussion shows,
we are likely to take a performance hit as we move to more runtime
determination of the_hash_algo parameters. Having these callers use the
more strict form will potentially help us recover that.

So in that sense we _could_ simply punt on this series until then (and
it's certainly post-v2.19 material). But I think it's worth doing now,
simply from a readability/annotation standpoint. IMHO the resulting code
is more clear (though I've long since taught myself to read !foocmp() as
equality).

I would quite like to see this series picked up for v2.20.  If we want
to minimize performance regressions with the SHA-256 work, I think it's
important.


Seconded.


Applying the following patch on top of this series causes gcc to inline
both branches, which is pretty much the best we can expect.  I haven't
benchmarked it, though, so I can't say what the actual performance
consequence is.


We should definitely measure the cost here, but when we have the option 
to move to newhash, that cost should be acceptable.


Thanks,

-Stolee



Re: [PATCH 0/9] introducing oideq()

2018-08-26 Thread brian m. carlson
On Sat, Aug 25, 2018 at 04:00:31AM -0400, Jeff King wrote:
> This is a follow-up to the discussion in:
> 
>   https://public-inbox.org/git/20180822030344.ga14...@sigill.intra.peff.net/
> 
> The general idea is that the majority of callers don't care about actual
> plus/minus ordering from oidcmp() and hashcmp(); they just want to know
> if two oids are equal. The stricter equality check can be optimized
> better by the compiler.
> 
> Due to the simplicity of the current code and our inlining, the compiler
> can usually figure this out for now. So I wouldn't expect this patch to
> actually improve performance right away. But as that discussion shows,
> we are likely to take a performance hit as we move to more runtime
> determination of the_hash_algo parameters. Having these callers use the
> more strict form will potentially help us recover that.
> 
> So in that sense we _could_ simply punt on this series until then (and
> it's certainly post-v2.19 material). But I think it's worth doing now,
> simply from a readability/annotation standpoint. IMHO the resulting code
> is more clear (though I've long since taught myself to read !foocmp() as
> equality).

I would quite like to see this series picked up for v2.20.  If we want
to minimize performance regressions with the SHA-256 work, I think it's
important.

Applying the following patch on top of this series causes gcc to inline
both branches, which is pretty much the best we can expect.  I haven't
benchmarked it, though, so I can't say what the actual performance
consequence is.

As for this series itself, other than the typos people have pointed out,
it looks good to me.

diff --git a/cache.h b/cache.h
index 3bb88ac6d0..1c182c6ef6 100644
--- a/cache.h
+++ b/cache.h
@@ -1033,7 +1033,10 @@ static inline int oidcmp(const struct object_id *oid1, 
const struct object_id *o
 
 static inline int hasheq(const unsigned char *sha1, const unsigned char *sha2)
 {
-   return !hashcmp(sha1, sha2);
+   if (the_hash_algo->rawsz == 32) {
+   return !memcmp(sha1, sha2, 32);
+   }
+   return !memcmp(sha1, sha2, 20);
 }
 
 static inline int oideq(const struct object_id *oid1, const struct object_id 
*oid2)
-- 
brian m. carlson: Houston, Texas, US
OpenPGP: https://keybase.io/bk2204


signature.asc
Description: PGP signature


[PATCH 0/9] introducing oideq()

2018-08-25 Thread Jeff King
This is a follow-up to the discussion in:

  https://public-inbox.org/git/20180822030344.ga14...@sigill.intra.peff.net/

The general idea is that the majority of callers don't care about actual
plus/minus ordering from oidcmp() and hashcmp(); they just want to know
if two oids are equal. The stricter equality check can be optimized
better by the compiler.

Due to the simplicity of the current code and our inlining, the compiler
can usually figure this out for now. So I wouldn't expect this patch to
actually improve performance right away. But as that discussion shows,
we are likely to take a performance hit as we move to more runtime
determination of the_hash_algo parameters. Having these callers use the
more strict form will potentially help us recover that.

So in that sense we _could_ simply punt on this series until then (and
it's certainly post-v2.19 material). But I think it's worth doing now,
simply from a readability/annotation standpoint. IMHO the resulting code
is more clear (though I've long since taught myself to read !foocmp() as
equality).

If we do punt, patch 1 could still be picked up now, as it's a related
cleanup that stands on its own.

The diffstat is scary, but the vast majority of that is purely
mechanical via coccinelle. There's some discussion in the individual
patches.

We also don't _have_ to convert all sites now. I strongly suspect that
only a few calls are actual measurable bottlenecks (the one in
lookup_object() is the one I was primarily interested in).  But it's
just as easy to do it all at once with coccinelle, rather than try to
measure each one (and once we add the coccinelle patches, we have to
convert everything, or "make coccicheck" will nag people to do so).

I didn't bother to hold back changes for any topics in flight.  There
are a handful of conflicts with "pu", but they're mostly trivial.  The
only big one is due to some code movement in ds/reachable. But though
the conflict is big, the resolution is still easy (you can even just
take the other side and "make coccicheck" to do it automatically).

  [1/9]: coccinelle: use <...> for function exclusion
  [2/9]: introduce hasheq() and oideq()
  [3/9]: convert "oidcmp() == 0" to oideq()
  [4/9]: convert "hashcmp() == 0" to hasheq()
  [5/9]: convert "oidcmp() != 0" to "!oideq()"
  [6/9]: convert "hashcmp() != 0" to "!hasheq()"
  [7/9]: convert hashmap comparison functions to oideq()
  [8/9]: read-cache: use oideq() in ce_compare functions
  [9/9]: show_dirstat: simplify same-content check

 bisect.c   |  6 ++--
 blame.c|  8 ++---
 builtin/am.c   |  2 +-
 builtin/checkout.c |  2 +-
 builtin/describe.c | 10 +++---
 builtin/diff.c |  2 +-
 builtin/difftool.c |  4 +--
 builtin/fast-export.c  |  2 +-
 builtin/fetch.c|  6 ++--
 builtin/fmt-merge-msg.c|  6 ++--
 builtin/index-pack.c   |  8 ++---
 builtin/log.c  |  6 ++--
 builtin/merge-tree.c   |  2 +-
 builtin/merge.c|  6 ++--
 builtin/pack-objects.c |  4 +--
 builtin/pull.c |  4 +--
 builtin/receive-pack.c |  4 +--
 builtin/remote.c   |  2 +-
 builtin/replace.c  |  6 ++--
 builtin/rm.c   |  2 +-
 builtin/show-branch.c  |  6 ++--
 builtin/tag.c  |  2 +-
 builtin/unpack-objects.c   |  4 +--
 builtin/update-index.c |  4 +--
 bulk-checkin.c |  2 +-
 bundle.c   |  2 +-
 cache-tree.c   |  2 +-
 cache.h| 22 +
 combine-diff.c |  4 +--
 commit-graph.c | 10 +++---
 commit.c   |  2 +-
 connect.c  |  2 +-
 contrib/coccinelle/commit.cocci|  4 +--
 contrib/coccinelle/object_id.cocci | 50 --
 diff-lib.c |  4 +--
 diff.c | 23 ++
 diffcore-break.c   |  2 +-
 diffcore-rename.c  |  2 +-
 dir.c  |  6 ++--
 fast-import.c  | 10 +++---
 fetch-pack.c   |  2 +-
 http-push.c|  2 +-
 http-walker.c  |  4 +--
 http.c |  2 +-
 log-tree.c |  6 ++--
 match-trees.c  |  2 +-
 merge-recursive.c  |  4 +--
 notes-merge.c  | 24 +++---
 notes.c|  8 ++---
 object.c   |  2 +-
 oidmap.c   | 12 +++
 pack-check.c   |  6 ++--
 pack-objects.c |  2 +-
 pack-write.c