Re: [PATCH 10/10] get_short_sha1: list ambiguous objects on error

2016-09-29 Thread Jacob Keller
On Thu, Sep 29, 2016 at 10:19 AM, Junio C Hamano  wrote:
> Jeff King  writes:
>>   - "cat-file --batch-check" can show you the sha1 and type, but it
>> won't abbreviate sha1s, and it won't show you commit/tag information
>>
>>   - "log --stdin --no-walk" will format the commit however you like, but
>> skips the trees and blobs entirely, and the tag can only be seen via
>> "%d"
>>
>>   - "for-each-ref" has flexible formatting, too, but wants to format
>> refs, not objects (and doesn't read from stdin).
>
> - "name-rev" is used to give "describe --contains", and can read
>   from its standard input, but has no format customization.
>   Another downside of it is that it only wants to see
>   committishes.
>

Some tool which reads standard input and can be formatted would be
nice. Extending name-rev with the same format options as for-each-ref
would be nice.

Thanks,
Jake


Re: [PATCH 10/10] get_short_sha1: list ambiguous objects on error

2016-09-29 Thread Junio C Hamano
Jeff King  writes:

>> $ git rev-parse --disambiguate-list=b2e1
>> b2e1196 tag v2.8.0-rc1
>> b2e11d1 tree
>> b2e1632 commit 2007-11-14 - Merge branch 'bs/maint-commit-options'
>> b2e1759 blob
>> b2e18954 blob
>> b2e1895c blob
>
> I think the "right" way to do this is pipe the list of sha1s into
> another git commit which can format them however you want.
> Unfortunately, there isn't a single command that does a great job:
>
>   - "cat-file --batch-check" can show you the sha1 and type, but it
> won't abbreviate sha1s, and it won't show you commit/tag information
>
>   - "log --stdin --no-walk" will format the commit however you like, but
> skips the trees and blobs entirely, and the tag can only be seen via
> "%d"
>
>   - "for-each-ref" has flexible formatting, too, but wants to format
> refs, not objects (and doesn't read from stdin).

- "name-rev" is used to give "describe --contains", and can read
  from its standard input, but has no format customization.
  Another downside of it is that it only wants to see
  committishes.

> IMHO that is a sign that our formatting tools aren't as good as they
> could be (I think the right tool is cat-file, but it should be able to
> do all of the formatting that the other commands can do).
>
> Of course if you really just want human-readable output, then:
>
>   $ git cat-file -e b2e1
>   error: short SHA1 b2e1 is ambiguous
>   hint: The candidates are:
>   hint:   b2e1196 tag v2.8.0-rc1
>   hint:   b2e11d1 tree
>   hint:   b2e1632 commit 2007-11-14 - Merge branch 'bs/maint-commit-options'
>   hint:   b2e1759 blob
>   hint:   b2e18954 blob
>   hint:   b2e1895c blob
>   fatal: Not a valid object name b2e1
>
> is pretty easy.

Yes.  I think adding this to rev-parse that is meant for machines is
probably a mistake, as this "hint" machinery's output will become
even more human friendly over time as we gain experience.

 - If the hypothetical "--disambiguate-list" option wants to produce
   machine parseable output for scripts, it would mean its output
   (and whatgver the reading script can do based on its output for
   humans) will become less useful for humans over time.

 - If the hypothetical "--disambiguate-list" option only wants to
   replicate the human readable output that is designed to be
   improved over time and expects its output _not_ to be interpreted
   by scripts but merely be relayed, then why aren't these scripts
   just invoking the commands that already gives the "hint:" output
   and showing that directly to humans in the first place?

> That being said, I don't mind if somebody wanted to do a rev-parse
> option on top of my series. The formatting code is already split into
> its own function.

So let's not go there.


Re: [PATCH 10/10] get_short_sha1: list ambiguous objects on error

2016-09-29 Thread Jeff King
On Thu, Sep 29, 2016 at 07:36:27AM -0700, Kyle J. McKay wrote:

> On Sep 29, 2016, at 06:24, Jeff King wrote:
> 
> > > If you are doing "git show 235234" it should pick the tag (if it
> > > peels to a
> > > committish) because Git has already set a precedent of preferring
> > > tags over
> > > commits when it disambiguates ref names and otherwise pick the
> > > commit.
> > 
> > I'm not convinced that picking the tag is actually helpful in this case;
> > I agree with Linus that feeding something to "git show" almost always
> > wants to choose the commit.
> 
> Since "git show" peels tags you end up seeing the commit it refers to
> (assuming it's a committish tag).

Yes, but it's almost certainly _not_ the commit you meant. From your
example:

>c512b03:
>   c512b035556eff4d commit Merge branch 'rc/maint-reflog-msg-for-forced
>   c512b0344196931a tag(v0.99.9a) GIT 0.99.9a

If I'm looking for the commit c512b03, then it almost certainly isn't
v0.99.9a. That tag's commit is e634aec. Or another way of thinking about
it: you want to guess what the _writer_ of the note meant. Why would
somebody write "c512b03" when they could have written "v0.99.9a"? And
they certainly would not have written it if they meant "e634aec". :)

> > I also don't think tag ambiguity in short sha1s is all that interesting.
> 
> The Linux repository has this:
> 
>901069c:
>   901069c71415a76d commit iwlagn: change Copyright to 2011
>   901069c5c5b15532 tag(v2.6.38-rc4) Linux 2.6.38-rc4

Sure, I'm not surprised there's a collision. But I'd expect those to be
a tiny fraction of collisions. Here's the breakdown of object types in
my clone of linux.git:

  $ git cat-file --batch-all-objects --batch-check='%(objecttype)' |
sort | uniq -c
  1421198 blob
   618073 commit
  479 tag
  2877913 tree

That's a hundredth of a percent tag objects.  The chance that you have
_a_ 7-hex collision with a tag is relatively high. But the chance that
any given collision involves a tag is rather small.

-Peff


Re: [PATCH 10/10] get_short_sha1: list ambiguous objects on error

2016-09-29 Thread Kyle J. McKay

On Sep 29, 2016, at 06:24, Jeff King wrote:

If you are doing "git show 235234" it should pick the tag (if it  
peels to a
committish) because Git has already set a precedent of preferring  
tags over
commits when it disambiguates ref names and otherwise pick the  
commit.


I'm not convinced that picking the tag is actually helpful in this  
case;

I agree with Linus that feeding something to "git show" almost always
wants to choose the commit.


Since "git show" peels tags you end up seeing the commit it refers to  
(assuming it's a committish tag).


I also don't think tag ambiguity in short sha1s is all that  
interesting.


The Linux repository has this:

   901069c:
  901069c71415a76d commit iwlagn: change Copyright to 2011
  901069c5c5b15532 tag(v2.6.38-rc4) Linux 2.6.38-rc4

Since that tag peels to a commit, it seems like it would be incorrect  
to pick the commit over the tag when you're looking for a committish.


Either 901069c should resolve to the tag (which gets peeled to the  
commit) or it should error out with the hint messages.


The Git repository has this:

   c512b03:
  c512b035556eff4d commit Merge branch 'rc/maint-reflog-msg-for- 
forced

  c512b0344196931a tag(v0.99.9a) GIT 0.99.9a

So perhaps it's a little bit more interesting than it first appears.  :)

--Kyle


Re: [PATCH 10/10] get_short_sha1: list ambiguous objects on error

2016-09-29 Thread Jeff King
On Thu, Sep 29, 2016 at 06:01:51AM -0700, Kyle J. McKay wrote:

> But perhaps it makes sense to actually pick one if there's only one
> disambiguation of the type you're looking for.
> 
> For example given:
> 
> 235234a blob
> 2352347 tag
> 235234f tree
> 2352340 commit
> 
> If you are doing "git cat-file blob 235234" it should pick the blob and spit
> out a warning (and similarly for other cat-file types).  But "git cat-file
> -p 235234" would give the fatal error with the disambiguation hints because
> it wants type "any".

That code is already there; it's just a matter of whether git has enough
information to know the context. E.g. (in git.git):

  $ git show b2e11
  error: short SHA1 b2e11 is ambiguous
  hint: The candidates are:
  hint:   b2e1196 tag v2.8.0-rc1
  hint:   b2e11d1 tree
  ...

  $ git log b2e11
  commit ab5d01a29eb7380ceab070f0807c2939849c44bc (tag: v2.8.0-rc1)
  ...

The "show" command can show anything, but "log" really wants
committishes, so it's able to disambiguate. It looks like cat-file never
learned to feed its context, but it's probably something like this:

diff --git a/builtin/cat-file.c b/builtin/cat-file.c
index 94e67eb..ecbb959 100644
--- a/builtin/cat-file.c
+++ b/builtin/cat-file.c
@@ -56,12 +56,22 @@ static int cat_one_file(int opt, const char *exp_type, 
const char *obj_name,
struct object_info oi = {NULL};
struct strbuf sb = STRBUF_INIT;
unsigned flags = LOOKUP_REPLACE_OBJECT;
+   unsigned sha1_flags = 0;
const char *path = force_path;
 
if (unknown_type)
flags |= LOOKUP_UNKNOWN_OBJECT;
 
-   if (get_sha1_with_context(obj_name, 0, oid.hash, &obj_context))
+   if (exp_type) {
+   if (!strcmp(exp_type, "commit"))
+   sha1_flags |= GET_SHA1_COMMITTISH;
+   else if(!strcmp(exp_type, "tree"))
+   sha1_flags |= GET_SHA1_TREEISH;
+   else if(!strcmp(exp_type, "blob"))
+   sha1_flags |= GET_SHA1_BLOB;
+   }
+
+   if (get_sha1_with_context(obj_name, sha1_flags, oid.hash, &obj_context))
die("Not a valid object name %s", obj_name);
 
if (!path)

> If you are doing "git show 235234" it should pick the tag (if it peels to a
> committish) because Git has already set a precedent of preferring tags over
> commits when it disambiguates ref names and otherwise pick the commit.

I'm not convinced that picking the tag is actually helpful in this case;
I agree with Linus that feeding something to "git show" almost always
wants to choose the commit.

I also don't think tag ambiguity in short sha1s is all that interesting.
There are a tiny number of tag objects. Most of your collisions are
going to be with trees or blobs, which should generally outnumber
commits by a factor of 5-10, though it depends on your workflow (git.git
does not have a deep tree, so it's only a factor of 4).

And if you just want to choose a committish over trees and blobs, well,
then; I invite you to check out the core.disambiguate patch I sent
elsewhere in the thread. :)

-Peff


Re: [PATCH 10/10] get_short_sha1: list ambiguous objects on error

2016-09-29 Thread Jeff King
On Thu, Sep 29, 2016 at 04:46:19AM -0700, Kyle J. McKay wrote:

> This hint: information is excellent.  There needs to be a way to show it on
> demand.
> 
> $ git rev-parse --disambiguate=b2e1
> b2e11962c5e6a9c81aa712c751c83a743fd4f384
> b2e11d1bb40c5f81a2f4e37b9f9a60ec7474eeab
> b2e163272c01aca4aee4684f5c683ba341c1953d
> b2e18954c03ff502053cb74d142faab7d2a8dacb
> b2e1895ca92ec2037349d88b945ba64ebf16d62d
> 
> Not nearly so helpful, but the operation of --disambiguate cannot be changed
> without breaking current scripts.
> 
> Can your excellent "hint:" output above be attached to the --disambiguate
> option somehow, please.  Something like this perhaps:
> 
> $ git rev-parse --disambiguate-list=b2e1
> b2e1196 tag v2.8.0-rc1
> b2e11d1 tree
> b2e1632 commit 2007-11-14 - Merge branch 'bs/maint-commit-options'
> b2e1759 blob
> b2e18954 blob
> b2e1895c blob

I think the "right" way to do this is pipe the list of sha1s into
another git commit which can format them however you want.
Unfortunately, there isn't a single command that does a great job:

  - "cat-file --batch-check" can show you the sha1 and type, but it
won't abbreviate sha1s, and it won't show you commit/tag information

  - "log --stdin --no-walk" will format the commit however you like, but
skips the trees and blobs entirely, and the tag can only be seen via
"%d"

  - "for-each-ref" has flexible formatting, too, but wants to format
refs, not objects (and doesn't read from stdin).

IMHO that is a sign that our formatting tools aren't as good as they
could be (I think the right tool is cat-file, but it should be able to
do all of the formatting that the other commands can do).

Of course if you really just want human-readable output, then:

  $ git cat-file -e b2e1
  error: short SHA1 b2e1 is ambiguous
  hint: The candidates are:
  hint:   b2e1196 tag v2.8.0-rc1
  hint:   b2e11d1 tree
  hint:   b2e1632 commit 2007-11-14 - Merge branch 'bs/maint-commit-options'
  hint:   b2e1759 blob
  hint:   b2e18954 blob
  hint:   b2e1895c blob
  fatal: Not a valid object name b2e1

is pretty easy.

That being said, I don't mind if somebody wanted to do a rev-parse
option on top of my series. The formatting code is already split into
its own function.

-Peff


Re: [PATCH 10/10] get_short_sha1: list ambiguous objects on error

2016-09-29 Thread Kyle J. McKay

On Sep 26, 2016, at 09:36, Linus Torvalds wrote:


On Mon, Sep 26, 2016 at 5:00 AM, Jeff King  wrote:


This patch teaches get_short_sha1() to list the sha1s of the
objects it found, along with a few bits of information that
may help the user decide which one they meant.


This looks very good to me, but I wonder if it couldn't be even more  
aggressive.


In particular, the only hashes that most people ever use in short form
are commit hashes. Those are the ones you'd use in normal human
interactions to point to something happening.

So when the disambiguation notices that there is ambiguity, but there
is only _one_ commit, maybe it should just have an aggressive mode
that says "use that as if it wasn't ambiguous".


If you have this:

faa23ec9b437812ce2fc9a5b3d59418d672debc1 refs/heads/ambig
7f40afe646fa3f8a0f361b6f567d8f7d7a184c10 refs/tags/ambig

and you do this:

$ git rev-parse ambig
warning: refname 'ambig' is ambiguous.
7f40afe646fa3f8a0f361b6f567d8f7d7a184c10

Git automatically prefers the tag over the branch, but it does spit  
out a warning.



And then have an explicit command (or flag) to do disambiguation for
when you explicitly want it.


I think you don't even need that.  Git already does disambiguation for  
ref names, picks one and spits out a warning.


Why not do the same for short hash names when it makes sense?


Rationale: you'd never care about short forms for tags. You'd just use
the tag name. And while blob ID's certainly show up in short form in
diff output (in the "index" line), very few people will use them. And
tree hashes are basically never seen outside of any plumbing commands
and then seldom in shortened form.

So I think it would make sense to default to a mode that just picks
the commit hash if there is only one such hash. Sure, some command
might want a "treeish", but a commit is still more likely than a tree
or a tag.

But regardless, this series looks like a good thing.


I like it too.

But perhaps it makes sense to actually pick one if there's only one  
disambiguation of the type you're looking for.


For example given:

235234a blob
2352347 tag
235234f tree
2352340 commit

If you are doing "git cat-file blob 235234" it should pick the blob  
and spit out a warning (and similarly for other cat-file types).  But  
"git cat-file -p 235234" would give the fatal error with the  
disambiguation hints because it wants type "any".


If you are doing "git show 235234" it should pick the tag (if it peels  
to a committish) because Git has already set a precedent of preferring  
tags over commits when it disambiguates ref names and otherwise pick  
the commit.


Lets consider this approach using the stats for the Linux kernel:


Ambiguous prefix length 7 counts:
  prefixes:   44733
   objects:   89766

Ambiguous length 11 (but not at length 12) info:
  prefixes:   2
  0 (with 1 or more commit disambiguations)

Ambiguous length 10 (but not at length 11) info:
  prefixes:  12
  3 (with 1 or more commit disambiguations)
  0 (with 2 or more commit disambiguations)

Ambiguous length 9 (but not at length 10) info:
  prefixes: 186
 43 (with 1 or more commit disambiguations)
  1 (with 2 or more commit disambiguations)

Ambiguous length 8 (but not at length 9) info:
  prefixes:2723
651 (with 1 or more commit disambiguations)
 40 (with 2 or more commit disambiguations)

Ambiguous length 7 (but not at length 8) info:
  prefixes:   41864
   9842 (with 1 or more commit disambiguations)
680 (with 2 or more commit disambiguations)


Of the 44733 ambiguous length 7 prefixes, only about 10539 of them  
disambiguate into one or more commit objects.


But if we apply the "spit a warning and prefer a commit object if  
there's only one and you're looking for a committish" rule, that drops  
the number from 10539 to about 721.  In other words, only about 7% of  
the previously ambiguous short commit SHA1 prefixes would continue to  
be ambiguous at length 7.  In fact it almost makes a prefix length of  
9 good enough, there's just the one at length 9 that disambiguates  
into more than one commit (45f014c52).


--Kyle


Re: [PATCH 10/10] get_short_sha1: list ambiguous objects on error

2016-09-29 Thread Kyle J. McKay

On Sep 26, 2016, at 05:00, Jeff King wrote:


 $ git rev-parse b2e1
 error: short SHA1 b2e1 is ambiguous
 hint: The candidates are:
 hint:   b2e1196 tag v2.8.0-rc1
 hint:   b2e11d1 tree
 hint:   b2e1632 commit 2007-11-14 - Merge branch 'bs/maint-commit- 
options'

 hint:   b2e1759 blob
 hint:   b2e18954 blob
 hint:   b2e1895c blob
 fatal: ambiguous argument 'b2e1': unknown revision or path not in  
the working tree.

 Use '--' to separate paths from revisions, like this:
 'git  [...] -- [...]'


This hint: information is excellent.  There needs to be a way to show  
it on demand.


$ git rev-parse --disambiguate=b2e1
b2e11962c5e6a9c81aa712c751c83a743fd4f384
b2e11d1bb40c5f81a2f4e37b9f9a60ec7474eeab
b2e163272c01aca4aee4684f5c683ba341c1953d
b2e18954c03ff502053cb74d142faab7d2a8dacb
b2e1895ca92ec2037349d88b945ba64ebf16d62d

Not nearly so helpful, but the operation of --disambiguate cannot be  
changed without breaking current scripts.


Can your excellent "hint:" output above be attached to the -- 
disambiguate option somehow, please.  Something like this perhaps:


$ git rev-parse --disambiguate-list=b2e1
b2e1196 tag v2.8.0-rc1
b2e11d1 tree
b2e1632 commit 2007-11-14 - Merge branch 'bs/maint-commit-options'
b2e1759 blob
b2e18954 blob
b2e1895c blob

Any option name will do, --disambiguate-verbose, --disambiguate- 
extended, --disambiguate-long, --disambiguate-log, --disambiguate- 
help, --disambiguate-show-me-something-useful-to-humans-not-scripts ...


--Kyle


Re: [PATCH 10/10] get_short_sha1: list ambiguous objects on error

2016-09-27 Thread Jeff King
On Mon, Sep 26, 2016 at 09:36:23AM -0700, Linus Torvalds wrote:

> On Mon, Sep 26, 2016 at 5:00 AM, Jeff King  wrote:
> >
> > This patch teaches get_short_sha1() to list the sha1s of the
> > objects it found, along with a few bits of information that
> > may help the user decide which one they meant.
> 
> This looks very good to me, but I wonder if it couldn't be even more
> aggressive.
> 
> In particular, the only hashes that most people ever use in short form
> are commit hashes. Those are the ones you'd use in normal human
> interactions to point to something happening.
> 
> So when the disambiguation notices that there is ambiguity, but there
> is only _one_ commit, maybe it should just have an aggressive mode
> that says "use that as if it wasn't ambiguous".

You can basically get that by using "1234^{commit}" all the time, as
that turns on the committish disambiguator function (though it's not
quite the same, as it would pick a tag, too; you really want the
commit-only disambiguator). But presumably you'd want it on all the
time. See the patch below, which lets you do:

  git config --global core.disambiguate commit

and I think should do what you want. I'm up in the air on whether it is
a good idea or not, but then I do not usually run into ambiguous sha1s.

> And then have an explicit command (or flag) to do disambiguation for
> when you explicitly want it.

In my patch you can tweak the config variable off, though it might make
sense to also have some per-short-sha1 syntax.

> Rationale: you'd never care about short forms for tags. You'd just use
> the tag name. And while blob ID's certainly show up in short form in
> diff output (in the "index" line), very few people will use them. And
> tree hashes are basically never seen outside of any plumbing commands
> and then seldom in shortened form.

I think I do sometimes "git show $blob_sha1" based on a diff index line.
OTOH, I don't think of though as "long-term" references. I'm usually
trying to apply the patch at the time, so it's fairly fresh (it's true
that the short-sha1 may have been generated on the sender's side, who
has fewer objects, but I doubt that's a big problem in general; the real
issue is that it was unique at one point, and isn't a few years later).

But more importantly, any fallback like this should take a backseat to
context provided by the rest of git. So for instance, the index-building
in "am -3" uses the blob disambiguator, and should continue to do so
(and does with my patch).

> So I think it would make sense to default to a mode that just picks
> the commit hash if there is only one such hash. Sure, some command
> might want a "treeish", but a commit is still more likely than a tree
> or a tag.

By the same rule I just mentioned above, if you use the short sha1 in a
treeish context, it will look for any treeish (so "1234:foo" would
continue to look for any treeish, not just a commit). So that might not
be as desirable, but I think it does make sense (and of course it will
still tell you immediately what the options are, and you can decide what
to do).

-- >8 --
Subject: [PATCH] get_short_sha1: make default disambiguation configurable

When we find ambiguous short sha1s, we may get a
disambiguation rule from our caller's context. But if we
don't, we fall back to treating all sha1s the same, even
though most projects will tend to refer only to commits by
their short sha1s.

This patch introduces a configuration option that lets the
user pick a different fallback (e.g., only commits). It's
possible that we may want to make this the default, but it's
a good idea to start as a config option for two reasons:

  1. It lets people experiment with this and see if it's a
 good idea (i.e., the "tend to" above is an assumption;
 we don't really know if this will break some obscure
 cases).

  2. Even if we do flip the default, it gives people an
 escape hatch if it causes problems (you can sometimes
 override it by asking for "1234^{tree}", but not all
 combinations are possible).

Signed-off-by: Jeff King 
---
 cache.h |  2 ++
 config.c|  3 +++
 sha1_name.c | 32 
 t/t1512-rev-parse-disambiguation.sh | 14 ++
 4 files changed, 51 insertions(+)

diff --git a/cache.h b/cache.h
index 5df0f33..b9583c4 100644
--- a/cache.h
+++ b/cache.h
@@ -1224,6 +1224,8 @@ extern int get_oid(const char *str, struct object_id 
*oid);
 typedef int each_abbrev_fn(const unsigned char *sha1, void *);
 extern int for_each_abbrev(const char *prefix, each_abbrev_fn, void *);
 
+extern int set_disambiguate_hint_config(const char *var, const char *value);
+
 /*
  * Try to read a SHA1 in hexadecimal format from the 40 characters
  * starting at hex.  Write the 20-byte result to sha1 in binary form.
diff --git a/config.c b/config.c
index 1e4b617..83fdecb 100644
--- a/config.c
+++ b/config.c
@@ -841,6 +841,9 @@ static int 

Re: [PATCH 10/10] get_short_sha1: list ambiguous objects on error

2016-09-26 Thread Jacob Keller
On Mon, Sep 26, 2016 at 9:36 AM, Linus Torvalds
 wrote:
> This looks very good to me, but I wonder if it couldn't be even more 
> aggressive.
>
> In particular, the only hashes that most people ever use in short form
> are commit hashes. Those are the ones you'd use in normal human
> interactions to point to something happening.
>
> So when the disambiguation notices that there is ambiguity, but there
> is only _one_ commit, maybe it should just have an aggressive mode
> that says "use that as if it wasn't ambiguous".
>
> And then have an explicit command (or flag) to do disambiguation for
> when you explicitly want it.
>
> Rationale: you'd never care about short forms for tags. You'd just use
> the tag name. And while blob ID's certainly show up in short form in
> diff output (in the "index" line), very few people will use them. And
> tree hashes are basically never seen outside of any plumbing commands
> and then seldom in shortened form.
>
> So I think it would make sense to default to a mode that just picks
> the commit hash if there is only one such hash. Sure, some command
> might want a "treeish", but a commit is still more likely than a tree
> or a tag.
>

I'd think we would want to phase this in over a few releases if we do
this? Maybe at least sort commits first in the list so that they are
faster to spot.

I am trying to think of what problems we'd cause by having the
behavior be this aggressive...

Thanks,
Jake

> But regardless, this series looks like a good thing.
>
> Linus


Re: [PATCH 10/10] get_short_sha1: list ambiguous objects on error

2016-09-26 Thread Junio C Hamano
Jeff King  writes:

> Since it's attached to an error path, I'm guessing nobody will be too
> upset about it, so my inclination was to wait and let somebody add the
> conditional advice code if they're bothered.

Fair enough.  At that point of getting an error message, the only
thing they can do is to start wondering what object the person who
gave the now-non-unique abbrevation to them, so I suspect this is
one of the "advice" messages that can always be there.


Re: [PATCH 10/10] get_short_sha1: list ambiguous objects on error

2016-09-26 Thread Jeff King
On Mon, Sep 26, 2016 at 10:30:48AM -0700, Junio C Hamano wrote:

> Jeff King  writes:
> 
> > We also restrict the list to those that match any
> > disambiguation hint. E.g.:
> >
> >   $ git rev-parse b2e1:foo
> >   error: short SHA1 b2e1 is ambiguous
> >   hint: The candidates are:
> >   hint:   b2e1196 tag v2.8.0-rc1
> >   hint:   b2e11d1 tree
> >   hint:   b2e1632 commit 2007-11-14 - Merge branch 'bs/maint-commit-options'
> >   fatal: Invalid object name 'b2e1'.
> >
> > does not bother reporting the blobs, because they cannot
> > work as a treeish.
> 
> That's a nice touch, and it even comes free--how wonderful.
> 
> It somehow felt strange to have an expensive (compared to no-op,
> anyway) loop whose only externally visible effect is to call
> advise(), but there does not appear to be a way to even disable this
> advise() output, so it probably is OK, I guess.

Right, advise() always has an effect. But that reminds me.  I wasn't
sure if we should attach an advice.* config to this. If we do, then the
right place to put the conditional is right after the error() call in
get_short_sha1().

Since it's attached to an error path, I'm guessing nobody will be too
upset about it, so my inclination was to wait and let somebody add the
conditional advice code if they're bothered.

-Peff


Re: [PATCH 10/10] get_short_sha1: list ambiguous objects on error

2016-09-26 Thread Junio C Hamano
Jeff King  writes:

> We also restrict the list to those that match any
> disambiguation hint. E.g.:
>
>   $ git rev-parse b2e1:foo
>   error: short SHA1 b2e1 is ambiguous
>   hint: The candidates are:
>   hint:   b2e1196 tag v2.8.0-rc1
>   hint:   b2e11d1 tree
>   hint:   b2e1632 commit 2007-11-14 - Merge branch 'bs/maint-commit-options'
>   fatal: Invalid object name 'b2e1'.
>
> does not bother reporting the blobs, because they cannot
> work as a treeish.

That's a nice touch, and it even comes free--how wonderful.

It somehow felt strange to have an expensive (compared to no-op,
anyway) loop whose only externally visible effect is to call
advise(), but there does not appear to be a way to even disable this
advise() output, so it probably is OK, I guess.

>  
> +test_expect_success C_LOCALE_OUTPUT 'ambiguity hints' '
> + test_must_fail git rev-parse 0 2>stderr &&
> + grep ^hint: stderr >hints &&
> + # 16 candidates, plus one intro line
> + test_line_count = 17 hints
> +'
> +
> +test_expect_success C_LOCALE_OUTPUT 'ambiguity hints respect type' '
> + test_must_fail git rev-parse 0^{commit} 2>stderr &&
> + grep ^hint: stderr >hints &&
> + # 5 commits, 1 tag (which is a commitish), plus intro line
> + test_line_count = 7 hints
> +'
> +
> +test_expect_success C_LOCALE_OUTPUT 'failed type-selector still shows hint' '
> + # these two blobs share the same prefix "ee3d", but neither
> + # will pass for a commit
> + echo 851 | git hash-object --stdin -w &&
> + echo 872 | git hash-object --stdin -w &&
> + test_must_fail git rev-parse ee3d^{commit} 2>stderr &&
> + grep ^hint: stderr >hints &&
> + test_line_count = 3 hints
> +'
> +
>  test_done


Re: [PATCH 10/10] get_short_sha1: list ambiguous objects on error

2016-09-26 Thread Linus Torvalds
On Mon, Sep 26, 2016 at 5:00 AM, Jeff King  wrote:
>
> This patch teaches get_short_sha1() to list the sha1s of the
> objects it found, along with a few bits of information that
> may help the user decide which one they meant.

This looks very good to me, but I wonder if it couldn't be even more aggressive.

In particular, the only hashes that most people ever use in short form
are commit hashes. Those are the ones you'd use in normal human
interactions to point to something happening.

So when the disambiguation notices that there is ambiguity, but there
is only _one_ commit, maybe it should just have an aggressive mode
that says "use that as if it wasn't ambiguous".

And then have an explicit command (or flag) to do disambiguation for
when you explicitly want it.

Rationale: you'd never care about short forms for tags. You'd just use
the tag name. And while blob ID's certainly show up in short form in
diff output (in the "index" line), very few people will use them. And
tree hashes are basically never seen outside of any plumbing commands
and then seldom in shortened form.

So I think it would make sense to default to a mode that just picks
the commit hash if there is only one such hash. Sure, some command
might want a "treeish", but a commit is still more likely than a tree
or a tag.

But regardless, this series looks like a good thing.

Linus