On Thu, Aug 02, 2012 at 03:51:17PM -0700, Junio C Hamano wrote:
> > On Wed, Aug 01, 2012 at 03:10:55PM -0700, Junio C Hamano wrote:
> > ...
> >> When you move porn/0001.jpg in the preimage to naughty/1.jpg in
> >> the postimage, they both can hit "*.jpg contentid=jpeg" line in the
> >> top-lev
Jeff King writes:
> On Wed, Aug 01, 2012 at 03:10:55PM -0700, Junio C Hamano wrote:
> ...
>> When you move porn/0001.jpg in the preimage to naughty/1.jpg in
>> the postimage, they both can hit "*.jpg contentid=jpeg" line in the
>> top-level .gitattribute file, and the contentid driver for jpe
On Thu, Aug 02, 2012 at 07:08:25PM +0700, Nguyen Thai Ngoc Duy wrote:
> > I implemented (1a). Implementing (1b) would be easy, but for a full-on
> > cache (especially for "-C"), I think the resulting size might be
> > prohibitive.
>
> (1a) is good regardless rename overrides. Why don't you polish
On Wed, Aug 01, 2012 at 03:10:55PM -0700, Junio C Hamano wrote:
> Jeff King writes:
>
> > On Tue, Jul 31, 2012 at 11:01:27PM -0700, Junio C Hamano wrote:
> > ...
> >> As we still have the pathname in this codepath, I am wondering if we
> >> would benefit from custom "content hash" that knows the
On Thu, Aug 2, 2012 at 4:27 AM, Jeff King wrote:
> Yes, if you go with a commit-based approach, you can do either notes or
> in-commit messages. In other words, I would break the solutions down as:
>
> 1. Store sha1+sha1 -> score mapping (i.e., what I suggested). This is
> fundamentally a g
Jeff King writes:
> On Tue, Jul 31, 2012 at 11:01:27PM -0700, Junio C Hamano wrote:
>
>> - As entries in rename cache that record high scores have names of
>>"similar" blobs, pack-objects may be able to take advantage of
>>this information.
>
> Yeah, although I suspect it is not as big a
Jeff King writes:
> On Tue, Jul 31, 2012 at 11:01:27PM -0700, Junio C Hamano wrote:
> ...
>> As we still have the pathname in this codepath, I am wondering if we
>> would benefit from custom "content hash" that knows the nature of
>> payload than the built-in similarity estimator, driven by the
>
On Tue, Jul 31, 2012 at 11:01:27PM -0700, Junio C Hamano wrote:
> > @@ -175,6 +177,11 @@ static int estimate_similarity(struct diff_filespec
> > *src,
> > if (max_size * (MAX_SCORE-minimum_score) < delta_size * MAX_SCORE)
> > return 0;
> >
> > + hashcpy(pair.one, src->sha1);
>
On Wed, Aug 01, 2012 at 01:34:23PM +0700, Nguyen Thai Ngoc Duy wrote:
> On Wed, Aug 1, 2012 at 1:09 PM, Junio C Hamano wrote:
> > Nguyen Thai Ngoc Duy writes:
> >
> >> Yes. This is probably cosmetics only, but without path information, we
> >> leave it to chance to decide which A to pair with B
On Wed, Aug 01, 2012 at 11:36:00AM +0700, Nguyen Thai Ngoc Duy wrote:
> > That is orthogonal to the issue of what is being stored. I chose my
> > mmap'd disk implementation because it is very fast, which makes it nice
> > for a performance cache. But you could store the same thing in git-notes
> >
On Wed, Aug 1, 2012 at 1:09 PM, Junio C Hamano wrote:
> Nguyen Thai Ngoc Duy writes:
>
>> Yes. This is probably cosmetics only, but without path information, we
>> leave it to chance to decide which A to pair with B and C (in the
>> A->B, A->C case above). Wrong path might lead to funny effects (
Nguyen Thai Ngoc Duy writes:
> Yes. This is probably cosmetics only, but without path information, we
> leave it to chance to decide which A to pair with B and C (in the
> A->B, A->C case above). Wrong path might lead to funny effects (i'm
> thinking of git log --follow).
Isn't that why and ca
Jeff King writes:
> @@ -175,6 +177,11 @@ static int estimate_similarity(struct diff_filespec *src,
> if (max_size * (MAX_SCORE-minimum_score) < delta_size * MAX_SCORE)
> return 0;
>
> + hashcpy(pair.one, src->sha1);
> + hashcpy(pair.two, dst->sha1);
> + if (renam
On Wed, Aug 1, 2012 at 9:01 AM, Jeff King wrote:
> On Wed, Aug 01, 2012 at 08:10:12AM +0700, Nguyen Thai Ngoc Duy wrote:
>
>> > I do not think that is the right direction. Let's imagine that I have a
>> > commit "A" and I annotate it (via notes or whatever) to say "between
>> > A^^{tree} and A^{tr
On Wed, Aug 01, 2012 at 08:10:12AM +0700, Nguyen Thai Ngoc Duy wrote:
> > I do not think that is the right direction. Let's imagine that I have a
> > commit "A" and I annotate it (via notes or whatever) to say "between
> > A^^{tree} and A^{tree}, foo.c became bar.c". That will help me when
> > doi
On Wed, Aug 1, 2012 at 2:23 AM, Jeff King wrote:
>> It is a good direction to go in, I would think, to give users a way
>> to explicitly tell that "in comparison between these two trees, I
>> know path B in the postimage corresponds to path A in the preimage".
>
> I do not think that is the right
On Tue, Jul 31, 2012 at 01:20:42PM -0700, Junio C Hamano wrote:
> Jeff King writes:
>
> > A much better hint is to annotate pairs of sha1s, to say "do not bother
> > doing inexact rename correlation on this pair; I promise that they have
> > value N".
>
> Surely. And I suspect that the patch t
Jeff King writes:
> A much better hint is to annotate pairs of sha1s, to say "do not bother
> doing inexact rename correlation on this pair; I promise that they have
> value N".
Surely. And I suspect that the patch to the current codebase to do
so would be much less impact if you go that way.
On Tue, Jul 31, 2012 at 09:32:49AM -0700, Junio C Hamano wrote:
> Nguyen Thai Ngoc Duy writes:
>
> > The above output is done with "git diff --manual-rename=foo A B"
> > and "foo" contains (probably not in the best format though)
> >
> > -- 8< --
> > attr.c dir.c
> > dir.c attr.c
> > -- 8< --
>
Nguyen Thai Ngoc Duy writes:
> The above output is done with "git diff --manual-rename=foo A B"
> and "foo" contains (probably not in the best format though)
>
> -- 8< --
> attr.c dir.c
> dir.c attr.c
> -- 8< --
> ...
> Comments?
It is a good direction to go in, I would think, to give users a wa
20 matches
Mail list logo