On Wed, Aug 1, 2012 at 9:01 AM, Jeff King <p...@peff.net> wrote:
> On Wed, Aug 01, 2012 at 08:10:12AM +0700, Nguyen Thai Ngoc Duy wrote:
>
>> > I do not think that is the right direction. Let's imagine that I have a
>> > commit "A" and I annotate it (via notes or whatever) to say "between
>> > A^^{tree} and A^{tree}, foo.c became bar.c". That will help me when
>> > doing "git show" or "git log". But it will not help me when I later try
>> > to merge "A" (or its descendent). In that case, I will compute the diff
>> > between "A" and the merge-base (or worse, some descendent of "A" and the
>> > merge-base), and I will miss this hint entirely.
>> >
>> > A much better hint is to annotate pairs of sha1s, to say "do not bother
>> > doing inexact rename correlation on this pair; I promise that they have
>> > value N".
>>
>> I haven't had time to think it through yet, but I'll throw my thoughts
>> in anyway. I actually went with your approach first. But it's more
>> difficult to control the renaming. Assume we want to tell git to
>> rename SHA-1 "A" to SHA-1 "B". What happens if we have two As in the
>> source tree and two Bs in the target tree? What happens if two As and
>> one B, or one A and two Bs? What if a user defines A -> B and A -> C,
>> and we happen to have two As in source tree and B and C in target
>> tree?
>
> Yes, it disregards path totally. But if you had the exact same movement
> of content from one path to another in one instance, and it is
> considered a rename, wouldn't it also be a rename in a second instance?

Yes. This is probably only cosmetic, but without path information, we
leave it to chance to decide which A to pair with B and which with C (in
the A->B, A->C case above). A wrong path might lead to funny effects (I'm
thinking of git log --follow).
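
To make the ambiguity concrete, here is a rough sketch in Python. It is
not git code; possible_pairings and the toy trees are made up purely for
illustration, and "A", "B", "C" stand in for blob SHA-1s. It enumerates
every pairing that sha1-only hints would accept:

```python
from itertools import permutations

def possible_pairings(src_tree, dst_tree, sha1_hints):
    """Return every src-path -> dst-path assignment allowed by sha1-only hints.

    src_tree / dst_tree map path -> blob sha1 (toy strings here).
    sha1_hints is a set of (src_sha1, dst_sha1) pairs.
    More than one result means the hints alone cannot pick a pairing.
    """
    src_paths = sorted(src_tree)
    results = []
    for dst_paths in permutations(sorted(dst_tree), len(src_paths)):
        pairs = list(zip(src_paths, dst_paths))
        if all((src_tree[s], dst_tree[d]) in sha1_hints for s, d in pairs):
            results.append(dict(pairs))
    return results

# Two copies of A in the source tree, B and C in the target tree,
# and the user-defined hints A -> B and A -> C.
src = {"old/one.c": "A", "old/two.c": "A"}
dst = {"new/b.c": "B", "new/c.c": "C"}
hints = {("A", "B"), ("A", "C")}

pairings = possible_pairings(src, dst, hints)
for p in pairings:
    print(p)
```

Both assignments come back as valid, so which path ends up as the
rename source of new/b.c depends on iteration order, not on anything the
user said.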

>> There's also the problem with transferring this information. With
>> git-notes I think I can transfer it (though not automatically). How do
>> we transfer sha1 map (that you mentioned in the commit generation mail
>> in this thread)?

I wasn't clear. This is about transferring info across repositories.

> That is orthogonal to the issue of what is being stored. I chose my
> mmap'd disk implementation because it is very fast, which makes it nice
> for a performance cache. But you could store the same thing in git-notes
> (indexed by dst sha1, I guess, and then pointing to a blob of (src,
> score) pairs).
>
> If you want to include path-based hints in a commit, I'd say that using
> some micro-format in the commit message would be the simplest thing.

Rename correction happens after the commit is created, and I don't think
we can recreate commits.

> But
> that has been discussed before; ultimately the problem is that it only
> covers _one_ diff that we do with that commit (it is probably the most
> common, of course, but it doesn't cover them all).

How about we generate the sha1 mapping from commit hints? We take
advantage of path hints when we can, and otherwise fall back to the sha1
mapping. This way we can transfer commit hints as git-notes to another
repo, then regenerate the sha1 mapping there. No need to transfer sha1
maps.
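
Roughly what I have in mind, as a Python sketch rather than real code.
All names here (sha1_map_from_hints, find_rename_source, the toy trees)
are hypothetical, and I am assuming a hint is just a (src path, dst
path) pair attached to a commit:

```python
def sha1_map_from_hints(path_hints, tree_before, tree_after):
    """Regenerate (src_sha1, dst_sha1) pairs from one commit's path hints.

    path_hints: iterable of (src_path, dst_path) pairs, e.g. read from notes.
    tree_before / tree_after: path -> blob sha1 for the parent and the commit.
    """
    return {
        (tree_before[src], tree_after[dst])
        for src, dst in path_hints
        if src in tree_before and dst in tree_after
    }

def find_rename_source(dst_path, dst_sha1, path_hints, sha1_map, src_tree):
    """Pick a rename source for dst_path: path hint first, sha1 map second."""
    # 1. A path hint applies only if its source path exists in this diff.
    for src, dst in path_hints:
        if dst == dst_path and src in src_tree:
            return src
    # 2. Otherwise fall back to the sha1 pairs, ignoring paths.
    for path, sha1 in sorted(src_tree.items()):
        if (sha1, dst_sha1) in sha1_map:
            return path
    return None

# Hints travel with the commit; the sha1 map is rebuilt locally.
hints = [("foo.c", "bar.c")]
parent_tree = {"foo.c": "A"}
commit_tree = {"bar.c": "B"}
sha1_map = sha1_map_from_hints(hints, parent_tree, commit_tree)

# Diff of the commit against its parent: the path hint matches.
print(find_rename_source("bar.c", "B", hints, sha1_map, parent_tree))
# Some other diff with no usable path hint: the sha1 map still pairs A with B.
print(find_rename_source("lib/bar.c", "B", [], sha1_map, {"lib/foo.c": "A"}))
```

The second lookup is the arbitrary-tree case: the hint's paths do not
apply there, but the regenerated sha1 pair still does.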

>> > Then it will find that pair no matter which trees or commits
>> > are being diffed, and it will do so relatively inexpensively[1].
>>
>> But does that happen often in practice? I mean diffing two arbitrary
>> trees and expecting rename correction. I disregarded it as "git log" is
>> my main case, but I'm just a single user.
>
> It happens every time merge-recursive does rename detection, which
> includes "git merge" but also things like "cherry-pick".

Thanks. I'll look into merge/cherry-pick.
-- 
Duy