Re: [RFC/PATCH] patch-ids: check modified paths before calculating diff

2013-05-29 Thread John Keeping
On Wed, May 29, 2013 at 11:48:53AM -0700, Junio C Hamano wrote: > Jeff King writes: > > > On Wed, May 29, 2013 at 11:08:46AM -0700, Junio C Hamano wrote: > > > >> This has rather interesting ramifications on cherry-pick and rebase, > >> though. Both command can handle changes that come from an o

Re: [RFC/PATCH] patch-ids: check modified paths before calculating diff

2013-05-29 Thread Junio C Hamano
Jeff King writes: > On Wed, May 29, 2013 at 11:08:46AM -0700, Junio C Hamano wrote: > >> This has rather interesting ramifications on cherry-pick and rebase, >> though. Both command can handle changes that come from an old tree >> before some paths were renamed, but strict patch-id would not spo

Re: [RFC/PATCH] patch-ids: check modified paths before calculating diff

2013-05-29 Thread Jeff King
On Wed, May 29, 2013 at 11:08:46AM -0700, Junio C Hamano wrote: > This has rather interesting ramifications on cherry-pick and rebase, > though. Both command can handle changes that come from an old tree > before some paths were renamed, but strict patch-id would not spot > equivalent changes we

Re: [RFC/PATCH] patch-ids: check modified paths before calculating diff

2013-05-29 Thread Junio C Hamano
Jeff King writes: > I think such a loose patch-id could just be a hash of the filenames that > were changed by the patch (e.g., the first 32-bits of the sha1 of the > concatenated filenames). Computing that should be about as expensive as > a tree-diff. Per observation 2 above, if two commits do

Re: [RFC/PATCH] patch-ids: check modified paths before calculating diff

2013-05-29 Thread Jeff King
On Wed, May 29, 2013 at 03:22:25AM -0400, Jeff King wrote: > revs=origin/master...origin/jk/submodule-subdirectory-ok > stock|you |me > --- > real 0m0.501s | 0m0.078s | 0m0.098s > user 0m0.480s | 0m0.056s | 0m0.084s > sys 0m0.01

Re: [RFC/PATCH] patch-ids: check modified paths before calculating diff

2013-05-29 Thread Jeff King
On Wed, May 29, 2013 at 02:20:07AM -0400, Jeff King wrote: > In the best case, we compute no patch-ids at all. And even for the > average case, I'd expect our lazy calculation to only have to compute a > handful of ids. Here is a not-well-tested version of the idea. I tried to contain the changes

Re: [RFC/PATCH] patch-ids: check modified paths before calculating diff

2013-05-28 Thread Jeff King
On Sun, May 19, 2013 at 02:17:35PM +0100, John Keeping wrote: > When using "git cherry" or "git log --cherry-pick" we often have a small > number of commits on one side and a large number on the other. In > revision.c::cherry_pick_list we store the patch IDs for the small side > before comparing

Re: [RFC/PATCH] patch-ids: check modified paths before calculating diff

2013-05-20 Thread John Keeping
On Sun, May 19, 2013 at 11:36:23PM -0700, Jonathan Nieder wrote: > I don't know what it should mean to try to use --cherry without > --no-merges or --first-parent, so I guess this is harmless. Currently --no-merges doesn't actually get passed down this far. We do the patch ID calculations without

Re: [RFC/PATCH] patch-ids: check modified paths before calculating diff

2013-05-19 Thread Jonathan Nieder
John Keeping wrote: > On Sun, May 19, 2013 at 12:36:12PM -0700, Jonathan Nieder wrote: >>Who is responsible for freeing >> "path"? Would it make sense to use a strbuf here? >> >> strbuf_setlen(&buf, traverse_path_len(info, n)); >> make_traverse_pa

Re: [RFC/PATCH] patch-ids: check modified paths before calculating diff

2013-05-19 Thread Junio C Hamano
Jonathan Nieder writes: >> @@ -64,6 +199,13 @@ static struct patch_id *add_commit(struct commit *commit, >> unsigned char sha1[20]; >> int pos; >> >> +if (no_add) { >> +if (collect_touched_paths(commit, ids, 1) < 0) >> +return NULL; > > Ah, so this

Re: [RFC/PATCH] patch-ids: check modified paths before calculating diff

2013-05-19 Thread John Keeping
On Sun, May 19, 2013 at 12:36:12PM -0700, Jonathan Nieder wrote: > John Keeping wrote: > > > In this case, it is likely that only a small number of paths are touched > > by the commits on the smaller side of the range and by storing these we > > can ignore many commits on the other side before unp

Re: [RFC/PATCH] patch-ids: check modified paths before calculating diff

2013-05-19 Thread Jonathan Nieder
John Keeping wrote: > In this case, it is likely that only a small number of paths are touched > by the commits on the smaller side of the range and by storing these we > can ignore many commits on the other side before unpacking blobs and > diffing them. I like this idea a lot. > --- a/patch-id