On Fri, Apr 18, 2014 at 4:36 PM, Junio C Hamano <gits...@pobox.com> wrote:
> "Luis R. Rodriguez" <mcg...@do-not-panic.com> writes:
>
>> I think ultimately this reveals that given that tags *can* be
>> arbitrary and subjective,...
>
> Yes; see the part at the bottom.
>
>>> Commit A can be described in terms of both v3.4 and v9.0,
>>
>> And in the real example case, why *would* c5905afb be described in
>> terms of v3.5 instead of v3.4 ?
>
> I am not interested in graphing that particular history between v3.4
> and v3.5 myself.  If you are interested, I already gave you enough
> information on how to figure that out.

I was alluding to another possible issue here. My concern was that the
commit's parent (which is not really the point at which it was merged,
but rather where the topic got forked off to be worked on) could be
used as a reference point, but clearly it is not, given how name-rev
is implemented. I still see some possible issues with the commit's
parent in other commands (I haven't studied their implementations)
that echo some of my original concerns, but it's unclear whether they
are related. I also found that if we don't want to rely on dates or
start defining naming conventions, we may want to reconsider the
recursive implementation of name_rev(). I'll show a few results that
might help illustrate both concerns: that other commands may be using
the parent erroneously, and that there may be an alternative
implementation for name_rev(), or at the very least for
'git describe --contains'.

[0] mcgrof@ergon ~/linux (git::master)$ git log c5905afb..v3.5 | grep ^commit | wc -l
24878
[1] mcgrof@ergon ~/linux (git::master)$ git log c5905afb..v3.4 | grep ^commit | wc -l
13106
[2] mcgrof@ergon ~/linux (git::master)$ git log c5905afb..v3.3 | grep ^commit | wc -l
1360

Now that I have reviewed name_rev.c, I see that the recursive
name_rev() works top down from each tag, down to each v* tag object,
and pegs a name on each actual commit it visits. How we rule out each
tag under this implementation is not that obvious to me, especially
when results like [0] and [1] suggest v3.4 should be 'shorter' in
terms of the number of commits. I see now how we don't update a
commit's name if other crucial information, such as the points
discussed in this thread, might be important for the user, and I can
see how this can help. An alternative approach, which is what I
expected to see implemented at least for 'git describe --contains',
would be to count how many commits are present from the commit's
*merged* upstream parent (not the actual parent: in c5905afb's case
that is v3.3, which is not where it got merged). Taking the smallest
number of commits under this logic, and stopping when we don't find
any commits, should yield the base tag under which the commit was
merged, without any heuristics on dates. This applies to Linux,
however, given that we don't merge commits on stable branches but
rather create new commits that reference the upstream sha1sum, a
practice which also solves the problem Jeff pointed out.
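
To make the counting idea a bit more concrete, here is a minimal
sketch in Python over a toy history. It is not git's code: the graph,
the tag table and the helpers ancestors(), commits_between() and
closest_containing_tag() are all made up for illustration, and
restricting the search to tags that already contain the commit is my
own assumption.

```python
# Toy history: each commit maps to its list of parents (hypothetical data).
parents = {
    "base": [],
    "t1": ["base"],          # topic commit, forked before v3.3
    "m1": ["base"],
    "v3.3": ["m1"],
    "m2": ["v3.3"],
    "merge": ["m2", "t1"],   # topic merged after v3.3
    "v3.4": ["merge"],
    "m3": ["v3.4"],
    "v3.5": ["m3"],
}
# Tag name -> tagged commit (here the commit ids reuse the tag names).
tags = {"v3.3": "v3.3", "v3.4": "v3.4", "v3.5": "v3.5"}


def ancestors(parents, tip):
    """Return every commit reachable from tip, including tip itself."""
    seen = set()
    stack = [tip]
    while stack:
        commit = stack.pop()
        if commit in seen:
            continue
        seen.add(commit)
        stack.extend(parents.get(commit, []))
    return seen


def commits_between(parents, commit, tag_tip):
    """Count commits reachable from tag_tip but not from commit,
    i.e. the size of the range commit..tag_tip."""
    return len(ancestors(parents, tag_tip) - ancestors(parents, commit))


def closest_containing_tag(parents, tags, commit):
    """Among tags that contain commit, return (name, count) for the
    tag with the fewest commits in commit..tag, or None if no tag
    contains the commit."""
    best = None
    for name, tip in tags.items():
        if commit not in ancestors(parents, tip):
            continue  # assumption: only tags containing the commit qualify
        count = commits_between(parents, commit, tip)
        if best is None or count < best[1]:
            best = (name, count)
    return best


print(closest_containing_tag(parents, tags, "t1"))
```

In this toy graph the topic commit t1 is forked before v3.3 and merged
before v3.4. The range t1..v3.3 is not empty even though v3.3 does not
contain t1, so a raw count alone is not enough and the containment
check matters.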

The results for command [2] above are, however, a bit surprising. I'd
take a look, but I should go back to other things; I figured I'd at
least bring it up now as it seems relevant.

>>>     - find candidate tags that can be used to "describe --contains"
>>>       the commit A, yielding v3.4, v3.5 (not shown), and v9.0;
>>
>>>     - among the candidate tags, cull the ones that contain another
>>>       candidate tag, rejecting v3.5 (not shown) and v9.0;
>>
>>>     - among the surviving tags, pick the closest.
>>>
>>> Hmm?
>>
>> Sounds good to me!
>
> Not so fast ;-)
>
> My other message to Peff in response to his other example has an
> updated position on this.  "Reject candidates that can reach other
> candidates" is universally correct, but after that point, there are
> at least three but probably more options that suit preference of
> different people and project to break ties:
>
>  - Your case that started this thread may want to favor v3.4 if only
>    because that v3.4 _sounds_ smaller than v4.0 (in Peff's example),
>    even when v3.4 and v4.0 do not have ancestry relationship.
>
>  - The "closest" we have had is a heuristic to produce a result that
>    is textually shorter.
>
>  - And as I alluded to, "which one has the earliest timestamp?", is
>    another valid question to ask.

The first one above can be subjective if and only if the Linux
upstream model of dealing with stable branches is not followed. In
other words, I think it's a non-issue if you create new commits on the
stable branches instead of merging stuff onto them. This, however, is
a technical practice that I guess not everyone follows.

> And there may be more to appear.  A new command line option (and
> possibly a new configuration) to choose from these three (and more
> heuristics that will be added later) would be necessary.

Yeah, this is rather complex. The resolutions you've described seem
reasonable to me, but I do wonder if this can be simplified by
reevaluating how the candidates are considered. You'd know better :)
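
For what it's worth, this is how I read the candidate steps quoted
above, as a rough Python sketch on a toy graph rather than anything
taken from git itself. The history, the tag dates and the names
ancestors(), cull_candidates(), version_key() and tag_dates are all
hypothetical. The graph mimics the case where v3.4 and v4.0 both
contain commit A but have no ancestry relationship.

```python
# Toy history: each commit maps to its list of parents (hypothetical data).
parents = {
    "A": [],
    "v3.4": ["A"],
    "v3.5": ["v3.4"],
    "v4.0": ["A"],             # unrelated to v3.4 apart from sharing A
    "v9.0": ["v3.5", "v4.0"],
}
# Tag name -> tagged commit (here the commit ids reuse the tag names).
tags = {"v3.4": "v3.4", "v3.5": "v3.5", "v4.0": "v4.0", "v9.0": "v9.0"}
# Made-up tag timestamps, only used by the date tie-break below.
tag_dates = {"v3.4": 100, "v3.5": 110, "v4.0": 90, "v9.0": 200}


def ancestors(parents, tip):
    """Return every commit reachable from tip, including tip itself."""
    seen = set()
    stack = [tip]
    while stack:
        commit = stack.pop()
        if commit in seen:
            continue
        seen.add(commit)
        stack.extend(parents.get(commit, []))
    return seen


def cull_candidates(parents, tags, commit):
    """Find tags that contain commit, then reject every candidate that
    can reach another candidate. Returns the surviving names, sorted."""
    candidates = {
        name: tip for name, tip in tags.items()
        if commit in ancestors(parents, tip)
    }
    survivors = []
    for name, tip in candidates.items():
        reach = ancestors(parents, tip)
        reaches_other = any(
            other_tip in reach
            for other, other_tip in candidates.items()
            if other != name
        )
        if not reaches_other:
            survivors.append(name)
    return sorted(survivors)


def version_key(name):
    """Order tags by the numbers in their name, e.g. 'v3.4' -> (3, 4)."""
    return tuple(int(part) for part in name.lstrip("v").split("."))


survivors = cull_candidates(parents, tags, "A")
by_name = min(survivors, key=version_key)      # "sounds smaller"
by_date = min(survivors, key=tag_dates.get)    # earliest timestamp
print(survivors, by_name, by_date)
```

With this made-up data the culling step leaves v3.4 and v4.0, and the
two tie-breaks then disagree: ordering by name picks v3.4, while
ordering by the hypothetical timestamps picks v4.0. That is the part
where a policy choice seems unavoidable.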

 Luis