Re: [PATCH] refs.c: get_ref_cache: use a bucket hash

2015-11-16 Thread Jeff King
On Sat, Nov 14, 2015 at 02:35:01PM +0100, Andreas Krey wrote: > On Fri, 13 Nov 2015 19:01:18 +, Jeff King wrote: > ... > > 2. But for a little more work, pushing the is_git_directory() check > > out to the call-sites gives us probably saner semantics overall. > > Oops, now I get

Re: [PATCH] refs.c: get_ref_cache: use a bucket hash

2015-11-14 Thread Andreas Krey
On Fri, 13 Nov 2015 19:01:18 +, Jeff King wrote: > > Can't we handle this in resolve_gitlink_ref itself? As I understand it, > > it should resolve a ref (here "HEAD") when path points to a submodule. > > When there isn't one it should return -1, so: > > I'm not sure. I think part of the

Re: [PATCH] refs.c: get_ref_cache: use a bucket hash

2015-11-14 Thread Andreas Krey
On Fri, 13 Nov 2015 19:01:18 +, Jeff King wrote: ... > 2. But for a little more work, pushing the is_git_directory() check > out to the call-sites gives us probably saner semantics overall. Oops, now I get it[1]: You mean replacing resolve_gitlink_ref usages with is_git_directory,

Re: [PATCH] refs.c: get_ref_cache: use a bucket hash

2015-11-13 Thread Andreas Krey
On Tue, 17 Mar 2015 01:48:00 +, Jeff King wrote: > On Mon, Mar 16, 2015 at 10:35:18PM -0700, Junio C Hamano wrote: > > > > It looks like we don't even really care about the value of HEAD. We just > > > want to know "is it a git directory?". I think in other places (like > > > "git add"), we

Re: [PATCH] refs.c: get_ref_cache: use a bucket hash

2015-11-13 Thread Jeff King
On Fri, Nov 13, 2015 at 04:29:15PM +0100, Andreas Krey wrote: > > Likewise, I think dir.c:remove_dir_recurse is in a similar boat. > > Grepping for resolve_gitlink_ref, it looks like there may be others, > > too. > > Can't we handle this in resolve_gitlink_ref itself? As I understand it, > it

Re: [PATCH] refs.c: get_ref_cache: use a bucket hash

2015-03-16 Thread Jeff King
[+cc Michael for get_ref_cache wisdom] On Mon, Mar 16, 2015 at 07:40:40PM +0100, Andreas Krey wrote: I am guessing that the repository has tons of submodules? Not a single one. That's the interesting thing that makes me think I'm not actually solving the right problem. This repo has

[PATCH] refs.c: get_ref_cache: use a bucket hash

2015-03-16 Thread Andreas Krey
get_ref_cache used a linear list, which obviously is O(n^2). Use a fixed bucket hash which just takes a factor of 10^5 (~ 317^2) out of the n^2 - which is enough. Signed-off-by: Andreas Krey a.k...@gmx.de --- This brings 'git clean -ndx' times down from 17 minutes to 11 seconds on one of our
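
For readers without the patch at hand, the idea described above (replace one linear list with a fixed number of hash buckets, each holding a short chain) can be sketched roughly as follows. This is not the code from the patch: struct cache_entry, hash_path() and get_cache_entry() are hypothetical names, the hash function is an arbitrary choice, and the bucket count of 317 is only borrowed from the number in the description.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Assumed bucket count, borrowed from the "317" in the patch description. */
#define CACHE_BUCKETS 317

/* Hypothetical stand-in for a per-path cache entry. */
struct cache_entry {
	struct cache_entry *next;	/* next entry in the same bucket */
	char *path;			/* lookup key */
};

static struct cache_entry *buckets[CACHE_BUCKETS];

/* Simple multiplicative string hash; the choice of hash is an assumption. */
static unsigned int hash_path(const char *path)
{
	unsigned int h = 5381;

	while (*path)
		h = h * 33 + (unsigned char)*path++;
	return h % CACHE_BUCKETS;
}

/*
 * Return the entry for path, creating it on first use.
 * Only the chain of one bucket is scanned, not every entry.
 * Returns NULL if allocation fails.
 */
static struct cache_entry *get_cache_entry(const char *path)
{
	struct cache_entry *e;
	unsigned int b;
	size_t len;

	assert(path != NULL);
	b = hash_path(path);
	for (e = buckets[b]; e; e = e->next)
		if (!strcmp(e->path, path))
			return e;

	e = malloc(sizeof(*e));
	if (!e)
		return NULL;
	len = strlen(path) + 1;
	e->path = malloc(len);
	if (!e->path) {
		free(e);
		return NULL;
	}
	memcpy(e->path, path, len);
	e->next = buckets[b];	/* push onto the front of the bucket chain */
	buckets[b] = e;
	return e;
}
```

Calling get_cache_entry() twice with the same path returns the same entry. With a fixed bucket count the chains still grow with the number of entries, so this shortens each lookup by roughly the bucket count rather than making it constant time.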

Re: [PATCH] refs.c: get_ref_cache: use a bucket hash

2015-03-16 Thread Junio C Hamano
Andreas Krey a.k...@gmx.de writes: get_ref_cache used a linear list, which obviously is O(n^2). Use a fixed bucket hash which just takes a factor of 10^5 (~ 317^2) out of the n^2 - which is enough. Signed-off-by: Andreas Krey a.k...@gmx.de --- This brings 'git clean -ndx' times down

Re: [PATCH] refs.c: get_ref_cache: use a bucket hash

2015-03-16 Thread Thomas Gummerer
Hi, On 03/16, Andreas Krey wrote: get_ref_cache used a linear list, which obviously is O(n^2). Use a fixed bucket hash which just takes a factor of 10^5 (~ 317^2) out of the n^2 - which is enough. Signed-off-by: Andreas Krey a.k...@gmx.de --- This brings 'git clean -ndx' times down from

Re: [PATCH] refs.c: get_ref_cache: use a bucket hash

2015-03-16 Thread Junio C Hamano
Jeff King p...@peff.net writes: The get_ref_cache code was designed to scale to the actual number of submodules. I do not mind seeing it become a hash if people really do have a large number of submodules, but that is not what is happening here. ... So git-clean speculatively asks what is

Re: [PATCH] refs.c: get_ref_cache: use a bucket hash

2015-03-16 Thread Jeff King
On Mon, Mar 16, 2015 at 10:35:18PM -0700, Junio C Hamano wrote: It looks like we don't even really care about the value of HEAD. We just want to know "is it a git directory?". I think in other places (like "git add"), we just do an existence check for $dir/.git. That would not catch a bare
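
The "existence check for $dir/.git" mentioned above could look roughly like the sketch below. has_dot_git() is a hypothetical helper, not a function from git, and the fixed buffer size is an assumption made to keep the example short. It only tests that something named .git exists under the directory; the message itself starts to name a case such a check would not catch, and the text is cut off at that point.

```c
#include <assert.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: return 1 if "<dir>/.git" exists, 0 otherwise. */
static int has_dot_git(const char *dir)
{
	char path[4096];	/* assumed upper bound for this sketch */
	int n;

	assert(dir != NULL);
	n = snprintf(path, sizeof(path), "%s/.git", dir);
	if (n < 0 || (size_t)n >= sizeof(path))
		return 0;	/* path too long for the buffer: report "not found" */

	/* F_OK asks only whether the path exists, not what kind of file it is. */
	return access(path, F_OK) == 0;
}
```

Compared with resolving HEAD in the candidate directory, a check like this touches a single path and needs no per-directory cache entry.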

Re: [PATCH] refs.c: get_ref_cache: use a bucket hash

2015-03-16 Thread Andreas Krey
On Mon, 16 Mar 2015 10:23:05 +, Junio C Hamano wrote: Andreas Krey a.k...@gmx.de writes: ... say a lot of ignored directories, but do you mean directories in the working tree (which I suppose do not have much to do with the submodule_ref_caches[])? Apparently, they do. I am guessing