[git-users] git describe's way of choosing the "most recent" tag

2017-08-05 Thread Kévin Le Gouguec
Hi,

Not sure this is a bug; I might just misunderstand git-describe's
algorithm. I am on Debian Jessie with git version 2.1.4; I also get
the same behavior on next (98096fd7a85b93626db8757f944f2d8ffdf7e96a).

I am trying to get the most recent tag on Emacs's repository[1]. With
master checked out (at eaa5dc9d102d10c79f10bee1994ad922b8fcf9c4),
running

$ git describe --tags --debug --match 'emacs-*' 

Yields:

searching to describe HEAD
 lightweight   129568 emacs-25.1
 lightweight     3807 emacs-25.1-rc2
 lightweight     3847 emacs-25.1-rc1
 annotated       3906 emacs-25.2
 annotated       3941 emacs-25.2-rc2
 lightweight     3947 emacs-25.0.95
 annotated       3957 emacs-25.2-rc1
 annotated       4001 emacs-25.1.91
 lightweight     4026 emacs-25.0.94
 lightweight     4050 emacs-25.1.90
traversed 129978 commits
more than 10 tags found; listed 10 most recent
gave up search at 5c587fdff164e8b90beb47f6da64b4884290e40a
emacs-25.1-129568-geaa5dc9

I am somewhat surprised that git does not choose "emacs-25.2". The
manpage says:

If multiple tags were found during the walk then the tag which has
the fewest commits different from the input commit-ish will be
selected and output. Here fewest commits different is defined as
the number of commits which would be shown by git log tag..input
will be the smallest number of commits possible.

If I run "git log tag..input":

$ git log --oneline emacs-25.1.. | wc -l
4847
$ git log --oneline emacs-25.2.. | wc -l
4514
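
For reference, the same count can be taken for every matching tag in one
pass. This is only a sketch: closest_tag and list_tag_distances are helper
names made up here, and tag names are assumed to contain no whitespace.
git rev-list --count should report the same number as piping
git log --oneline through wc -l.

```shell
#!/bin/sh
# Reads "<count> <tag>" lines on stdin and prints the tag with the
# smallest count. closest_tag is a hypothetical helper, not a git command.
closest_tag() {
    sort -k1,1n | head -n 1 | cut -d' ' -f2
}

# Prints "<count> <tag>" for every tag matching the pattern in $1, where
# <count> is the number of commits in <tag>..HEAD. Run inside a work tree.
list_tag_distances() {
    git tag --list "$1" | while read -r tag; do
        printf '%s %s\n' "$(git rev-list --count "$tag"..HEAD)" "$tag"
    done
}

# Example: list_tag_distances 'emacs-*' | closest_tag
```

Each tag triggers its own history walk, so this is likely to be slow on a
repository of this size.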

I added a bunch of debugging prints to compare_pt() in
builtin/describe.c:

+       fprintf(stderr, "comparing %s against %s\n",
+               a->name->path, b->name->path);
+
        if (a->depth != b->depth) {
+               fprintf(stderr, "\tdepths: %d-%d = %d\n",
+                       a->depth, b->depth, a->depth-b->depth);
                return a->depth - b->depth;
        }

Sample:

comparing emacs-25.2 against emacs-25.1
depths: 3906-3790 = 116

I see that after sorting, describe() calls finish_depth_computation(),
which updates the depth of each candidate. As far as I can see, this
update is only reflected in --debug's output, and does not change
which candidate is deemed "best". Is this on purpose?
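
A toy model of that ordering, as a sketch only: it is not git's code, and
pick_then_finalize and the three-column input format are made up for
illustration. The second column is the depth seen by compare_pt() in the
debugging prints above, the third is the depth printed by --debug.

```shell
#!/bin/sh
# Toy model of the ordering described above; this is NOT git's code.
# Input lines: <tag> <depth-at-sort-time> <depth-shown-by---debug>
# pick_then_finalize is a hypothetical name used only for this sketch.
pick_then_finalize() {
    # Rank candidates by the depth known when sorting happens, keep the
    # first one, and only then attach the depth reported at the end.
    sort -k2,2n | head -n 1 | awk '{ print $1 "-" $3 }'
}

# Numbers taken from the output quoted in this message.
printf '%s\n' \
    'emacs-25.2 3906 3906' \
    'emacs-25.1 3790 129568' |
    pick_then_finalize
```

With these inputs the sketch prints emacs-25.1-129568, the same prefix as
the name git produced: the larger depth only shows up after the choice has
already been made.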

In advance, thank you for your patience. I apologize for

- not having run blame or log on builtin/describe.c yet; that could
  show me something that would help me understand what is going on;

- not being able to reduce the test case (I tried to create a smaller
  repository with a somewhat similar topology, to no avail);

- maybe having missed something from the branch topology that explains
  this result;

- maybe having gone cross-eyed on the code and misinterpreted it.



[1]: https://git.savannah.gnu.org/git/emacs.git

To provide an overview of the topology, I tried running

$ git log --graph --oneline --decorate --simplify-by-decoration 

But there is some noise coming from merged feature branches. To
simplify the picture, if we only consider master and the emacs-25
maintenance branch, the graph looks like this:

* (master)
*\
*\* (tag: emacs-25.2, emacs-25)
|\* (tag: emacs-25.2-rc2)
| * (tag: emacs-25.2-rc1)
* |
*\|
*\* (tag: emacs-25.1)
|\* (tag: emacs-25.1-rc2)
| * (tag: emacs-25.1-rc1)
|/
*

I guess it is possible that some feature branches make the topology
complex enough that weird things may happen during traversal; for
example, the "concurrency" branch started off master before emacs-25,
and was merged between 25.1 and 25.2.

-- 
You received this message because you are subscribed to the Google Groups "Git 
for human beings" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to git-users+unsubscr...@googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


[git-users] Re: git describe's way of choosing the "most recent" tag

2017-08-05 Thread G. Sylvie Davies


On Saturday, August 5, 2017 at 2:04:07 PM UTC-7, Kévin Le Gouguec wrote:

I certainly agree with you -- git describe's behaviour in this example does 
not make any sense.

I know a slightly clunky way to get the most recent tag using git 
for-each-ref, so I thought I'd share it:

I run these two commands:

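# get most recent annotated tag (by commit date of the tagged commit)
$ git for-each-ref --sort='-*committerdate' refs/tags | head --lines=1

# get most recent lightweight tag (by time-of-commit)
$ git for-each-ref --sort='-committerdate' refs/tags | head --lines=1

The asterisk is what makes the difference between looking at annotated tags 
and lightweight tags: it tells for-each-ref to read the field from the object 
that a tag object points to, so only annotated tags have a date to sort on. 
Without it the field is read from the ref's own object, which is a commit 
only for lightweight tags.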

And then I compare the two results and hope I'm not dealing with the 
situation where a really old commit was recently given an annotated tag.

In your example these return "refs/tags/emacs-25.2" and 
"refs/tags/emacs-25.1.90" respectively.

If anyone knows a better way, I'd love to hear about it!
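
One direction that might be worth trying, offered as a sketch rather than 
something tested on the Emacs repository: for-each-ref also has a 
creatordate field, which uses the tagger date for annotated tags and the 
committer date for lightweight ones, so a single command can rank both 
kinds together. It needs a newer git than the 2.1.4 mentioned above (I 
believe the field appeared around 2.7), and newest_tag is a name made up 
here.

```shell
#!/bin/sh
# Prints the short name of the most recently created tag matching the
# pattern in $1. newest_tag is a hypothetical helper, not a git command.
# Assumes a git recent enough to know the creatordate field.
newest_tag() {
    git for-each-ref --sort=-creatordate --count=1 \
        --format='%(refname:short)' "refs/tags/$1"
}

# Example: newest_tag 'emacs-*'
```

This ranks by date only and ignores topology; adding --merged HEAD should 
limit it to tags reachable from the current commit.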


- Sylvie





 
