Re: [PATCH] sha1_name: don't resolve refs when core.warnambiguousrefs is false

2014-01-07 Thread Jeff King
On Mon, Jan 06, 2014 at 07:35:04PM -0800, Brodie Rao wrote:

 On Mon, Jan 6, 2014 at 7:32 PM, Brodie Rao bro...@sf.io wrote:
  This change ensures get_sha1_basic() doesn't try to resolve full hashes
  as refs when ambiguous ref warnings are disabled.
 
  This provides a substantial performance improvement when passing many
  hashes to a command (like git rev-list --stdin) when
  core.warnambiguousrefs is false. The check incurs 6 stat()s for every
  hash supplied, which can be costly over NFS.
 
 Forgot to add:
 
 Signed-off-by: Brodie Rao bro...@sf.io

Looks good to me.

I wonder if I should have simply gone this route instead of adding
warn_on_object_refname_ambiguity, and then people who want cat-file
--batch to be fast could just turn off core.warnAmbiguousRefs. I wanted
it to happen automatically, though. Alternatively, I guess cat-file
--batch could just turn off warn_ambiguous_refs itself.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] sha1_name: don't resolve refs when core.warnambiguousrefs is false

2014-01-07 Thread Junio C Hamano
Brodie Rao bro...@sf.io writes:

 This change ensures get_sha1_basic() doesn't try to resolve full hashes
 as refs when ambiguous ref warnings are disabled.

 This provides a substantial performance improvement when passing many
 hashes to a command (like git rev-list --stdin) when
 core.warnambiguousrefs is false. The check incurs 6 stat()s for every
 hash supplied, which can be costly over NFS.
 ---

Needs sign-off.  The patch looks good.

Thanks.

  sha1_name.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

 diff --git a/sha1_name.c b/sha1_name.c
 index e9c2999..10bd007 100644
 --- a/sha1_name.c
 +++ b/sha1_name.c
 @@ -451,9 +451,9 @@ static int get_sha1_basic(const char *str, int len, 
 unsigned char *sha1)
   int at, reflog_len, nth_prior = 0;
  
   if (len == 40  !get_sha1_hex(str, sha1)) {
 - if (warn_on_object_refname_ambiguity) {
 + if (warn_ambiguous_refs  warn_on_object_refname_ambiguity) {
   refs_found = dwim_ref(str, len, tmp_sha1, real_ref);
 - if (refs_found  0  warn_ambiguous_refs) {
 + if (refs_found  0) {
   warning(warn_msg, len, str);
   if (advice_object_name_warning)
   fprintf(stderr, %s\n, 
 _(object_name_msg));
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] sha1_name: don't resolve refs when core.warnambiguousrefs is false

2014-01-07 Thread Junio C Hamano
Jeff King p...@peff.net writes:

 Alternatively, I guess cat-file
 --batch could just turn off warn_ambiguous_refs itself.

Sounds like a sensible way to go, perhaps on top of this change?
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] sha1_name: don't resolve refs when core.warnambiguousrefs is false

2014-01-07 Thread Jeff King
On Tue, Jan 07, 2014 at 09:51:07AM -0800, Junio C Hamano wrote:

 Jeff King p...@peff.net writes:
 
  Alternatively, I guess cat-file
  --batch could just turn off warn_ambiguous_refs itself.
 
 Sounds like a sensible way to go, perhaps on top of this change?

The downside is that we would not warn about ambiguous refs anymore,
even if the user was expecting it to. I don't know if that matters much.
I kind of feel in the --batch situation that it is somewhat useless (I
wonder if rev-list --stdin should turn it off, too).

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] sha1_name: don't resolve refs when core.warnambiguousrefs is false

2014-01-07 Thread Junio C Hamano
Jeff King p...@peff.net writes:

 On Tue, Jan 07, 2014 at 09:51:07AM -0800, Junio C Hamano wrote:

 Jeff King p...@peff.net writes:
 
  Alternatively, I guess cat-file
  --batch could just turn off warn_ambiguous_refs itself.
 
 Sounds like a sensible way to go, perhaps on top of this change?

 The downside is that we would not warn about ambiguous refs anymore,
 even if the user was expecting it to. I don't know if that matters much.

That is true already with or without Brodie's change, isn't it?
With warn_on_object_refname_ambiguity, cat-file --batch makes us
ignore core.warnambigousrefs setting.  If we redo 25fba78d
(cat-file: disable object/refname ambiguity check for batch mode,
2013-07-12) to unconditionally disable warn_ambiguous_refs in
cat-file --batch and get rid of warn_on_object_refname_ambiguity,
the end result would be the same, no?

 I kind of feel in the --batch situation that it is somewhat useless (I
 wonder if rev-list --stdin should turn it off, too).

I think doing the same as cat-file --batch in rev-list --stdin
makes sense.  Both interfaces are designed to grok extended SHA-1s,
and full 40-hex object names could be ambiguous and we are missing
the warning for them.

Or are you wondering if we should revert 25fba78d, apply Brodie's
change to skip the ref resolution whose result is never used, and
tell people who want to use cat-file --batch (or rev-list
--stdin) to disable the ambiguity warning themselves?


--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] sha1_name: don't resolve refs when core.warnambiguousrefs is false

2014-01-07 Thread Jeff King
On Tue, Jan 07, 2014 at 11:38:15AM -0800, Junio C Hamano wrote:

   Alternatively, I guess cat-file
   --batch could just turn off warn_ambiguous_refs itself.
  
  Sounds like a sensible way to go, perhaps on top of this change?
 
  The downside is that we would not warn about ambiguous refs anymore,
  even if the user was expecting it to. I don't know if that matters much.
 
 That is true already with or without Brodie's change, isn't it?
 With warn_on_object_refname_ambiguity, cat-file --batch makes us
 ignore core.warnambigousrefs setting.  If we redo 25fba78d
 (cat-file: disable object/refname ambiguity check for batch mode,
 2013-07-12) to unconditionally disable warn_ambiguous_refs in
 cat-file --batch and get rid of warn_on_object_refname_ambiguity,
 the end result would be the same, no?

No, I don't think the end effect is the same (or maybe we are not
talking about the same thing. :) ).

There are two ambiguity situations:

  1. Ambiguous non-fully-qualified refs (e.g., same tag and head name).

  2. 40-hex sha1 object names which might also be unqualified ref names.

Prior to 25ffba78d, cat-file checked both (like all the rest of git).
But checking (2) is very expensive, since otherwise a 40-hex sha1 does
not need to do a ref lookup at all, and something like rev-list
--objects | cat-file --batch-check processes a large number of these.

Detecting (1) is not nearly as expensive. You must already be doing a
ref lookup to trigger it (so the relative cost is much closer), and your
query size is bounded by the number of refs, not the number of objects.

Commit 25ffba78d traded off some safety for a lot of performance by
disabling (2), but left (1) in place because the tradeoff is different.

The two options I was musing over earlier today were (all on top of
Brodie's patch):

  a. Revert 25ffba78d. With Brodie's patch, core.warnAmbiguousRefs
 disables _both_ warnings. So we default to safe-but-slow, but
 people who care about performance can turn off ambiguity warnings.
 The downside is that you have to know to turn it off manually (and
 it's a global config flag, so you end up turning it off
 _everywhere_, not just in big queries where it matters).

  b. Revert 25ffba78d, but then on top of it just turn off
 warn_ambiguous_refs unconditionally in cat-file --batch-check.
 The downside is that we drop the safety from (1). The upside is
 that the code is a little simpler, as we drop the extra flag.

And obviously:

  c. Just leave it at Brodie's patch with nothing else on top.

My thinking in favor of (b) was basically does anybody actually care
about ambiguous refs in this situation anyway?. If they do, then I
think (c) is my preferred choice.

  I kind of feel in the --batch situation that it is somewhat useless (I
  wonder if rev-list --stdin should turn it off, too).
 
 I think doing the same as cat-file --batch in rev-list --stdin
 makes sense.  Both interfaces are designed to grok extended SHA-1s,
 and full 40-hex object names could be ambiguous and we are missing
 the warning for them.

I'm not sure I understand what you are saying. We _do_ have the warning
for rev-list --stdin currently. We do _not_ have the warning for
cat-file --batch, since my 25ffba78d. I was wondering if rev-list
should go the same way as 25ffba78d, for efficiency reasons (e.g., think
piping to rev-list --no-walk --stdin).

 Or are you wondering if we should revert 25fba78d, apply Brodie's
 change to skip the ref resolution whose result is never used, and
 tell people who want to use cat-file --batch (or rev-list
 --stdin) to disable the ambiguity warning themselves?

See above. :)

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] sha1_name: don't resolve refs when core.warnambiguousrefs is false

2014-01-07 Thread Junio C Hamano
Jeff King p...@peff.net writes:

 On Tue, Jan 07, 2014 at 11:38:15AM -0800, Junio C Hamano wrote:

   Alternatively, I guess cat-file
   --batch could just turn off warn_ambiguous_refs itself.
  
  Sounds like a sensible way to go, perhaps on top of this change?
 
  The downside is that we would not warn about ambiguous refs anymore,
  even if the user was expecting it to. I don't know if that matters much.
 
 That is true already with or without Brodie's change, isn't it?
 With warn_on_object_refname_ambiguity, cat-file --batch makes us
 ignore core.warnambigousrefs setting.  If we redo 25fba78d
 (cat-file: disable object/refname ambiguity check for batch mode,
 2013-07-12) to unconditionally disable warn_ambiguous_refs in
 cat-file --batch and get rid of warn_on_object_refname_ambiguity,
 the end result would be the same, no?

 No, I don't think the end effect is the same (or maybe we are not
 talking about the same thing. :) ).

 There are two ambiguity situations:

   1. Ambiguous non-fully-qualified refs (e.g., same tag and head name).

   2. 40-hex sha1 object names which might also be unqualified ref names.

 Prior to 25ffba78d, cat-file checked both (like all the rest of git).
 But checking (2) is very expensive,...

Ahh, of course.  Sorry for forgetting about 1.

 The two options I was musing over earlier today were (all on top of
 Brodie's patch):

   a. Revert 25ffba78d. With Brodie's patch, core.warnAmbiguousRefs
  disables _both_ warnings. So we default to safe-but-slow, but
  people who care about performance can turn off ambiguity warnings.
  The downside is that you have to know to turn it off manually (and
  it's a global config flag, so you end up turning it off
  _everywhere_, not just in big queries where it matters).

Or git -c core.warnambiguousrefs=false cat-file --batch, but I
think a more important point is that it is no longer automatic for
known-to-be-heavy operations, and I agree with you that it is a
downside.

   b. Revert 25ffba78d, but then on top of it just turn off
  warn_ambiguous_refs unconditionally in cat-file --batch-check.
  The downside is that we drop the safety from (1). The upside is
  that the code is a little simpler, as we drop the extra flag.

 And obviously:

   c. Just leave it at Brodie's patch with nothing else on top.

 My thinking in favor of (b) was basically does anybody actually care
 about ambiguous refs in this situation anyway?. If they do, then I
 think (c) is my preferred choice.

OK.  I agree with that line of thinking.  Let's take it one step at
a time, i.e. do c. and also use warn_on_object_refname_ambiguity in
rev-list --stdin first and leave the simplification (i.e. b.) for
later.

  I kind of feel in the --batch situation that it is somewhat useless (I
  wonder if rev-list --stdin should turn it off, too).
 
 I think doing the same as cat-file --batch in rev-list --stdin
 makes sense.  Both interfaces are designed to grok extended SHA-1s,
 and full 40-hex object names could be ambiguous and we are missing
 the warning for them.

 I'm not sure I understand what you are saying. We _do_ have the warning
 for rev-list --stdin currently. We do _not_ have the warning for
 cat-file --batch, since my 25ffba78d.

What I wanted to say was that we would be discarding the safety for
rev-list --stdin with the same argument as we did for cat-file
--batch.  If the argument for the earlier cat-file --batch were
this interface only takes raw 40-hex object names, then the
situation would have been different, but that is not the case.

 I was wondering if rev-list should go the same way as 25ffba78d,
 for efficiency reasons (e.g., think piping to rev-list --no-walk
 --stdin).

Yes, and I was trying to agree with that, but apparently I failed
;-)
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] sha1_name: don't resolve refs when core.warnambiguousrefs is false

2014-01-07 Thread Jeff King
On Tue, Jan 07, 2014 at 12:31:57PM -0800, Junio C Hamano wrote:

c. Just leave it at Brodie's patch with nothing else on top.
 
  My thinking in favor of (b) was basically does anybody actually care
  about ambiguous refs in this situation anyway?. If they do, then I
  think (c) is my preferred choice.
 
 OK.  I agree with that line of thinking.  Let's take it one step at
 a time, i.e. do c. and also use warn_on_object_refname_ambiguity in
 rev-list --stdin first and leave the simplification (i.e. b.) for
 later.

Here's a series to do that. The first three are just cleanups I noticed
while looking at the problem.

While I was writing the commit messages, though, I had a thought. Maybe
we could simply do the check faster for the common case that most refs
do not look like object names? Right now we blindly call dwim_ref for
each get_sha1 call, which is the expensive part. If we instead just
loaded all of the refnames from the dwim_ref location (basically heads,
tags and the top-level of refs/), we could build an index of all of
the entries matching the 40-hex pattern. In 99% of cases, this would be
zero entries, and the check would collapse to a simple integer
comparison (and even if we did have one, it would be a simple binary
search in memory).

Our index is more racy than actually checking the filesystem, but I
don't think it matters here.

Anyway, here is the series I came up with, in the meantime. I can take a
quick peek at just making it faster, too.

  [1/4]: cat-file: refactor error handling of batch_objects
  [2/4]: cat-file: fix a minor memory leak in batch_objects
  [3/4]: cat-file: restore ambiguity warning flag in batch_objects
  [4/4]: revision: turn off object/refname ambiguity check for --stdin

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] sha1_name: don't resolve refs when core.warnambiguousrefs is false

2014-01-06 Thread Brodie Rao
On Mon, Jan 6, 2014 at 7:32 PM, Brodie Rao bro...@sf.io wrote:
 This change ensures get_sha1_basic() doesn't try to resolve full hashes
 as refs when ambiguous ref warnings are disabled.

 This provides a substantial performance improvement when passing many
 hashes to a command (like git rev-list --stdin) when
 core.warnambiguousrefs is false. The check incurs 6 stat()s for every
 hash supplied, which can be costly over NFS.

Forgot to add:

Signed-off-by: Brodie Rao bro...@sf.io

 ---
  sha1_name.c | 4 ++--
  1 file changed, 2 insertions(+), 2 deletions(-)

 diff --git a/sha1_name.c b/sha1_name.c
 index e9c2999..10bd007 100644
 --- a/sha1_name.c
 +++ b/sha1_name.c
 @@ -451,9 +451,9 @@ static int get_sha1_basic(const char *str, int len, 
 unsigned char *sha1)
 int at, reflog_len, nth_prior = 0;

 if (len == 40  !get_sha1_hex(str, sha1)) {
 -   if (warn_on_object_refname_ambiguity) {
 +   if (warn_ambiguous_refs  warn_on_object_refname_ambiguity) {
 refs_found = dwim_ref(str, len, tmp_sha1, real_ref);
 -   if (refs_found  0  warn_ambiguous_refs) {
 +   if (refs_found  0) {
 warning(warn_msg, len, str);
 if (advice_object_name_warning)
 fprintf(stderr, %s\n, 
 _(object_name_msg));
 --
 1.8.3.4 (Apple Git-47)

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html