On 22.10.2013 00:58, pro-logic wrote:
>> The trace_performance functions require manual instrumentation of
>> the code sections you want to measure
> Ahh a case of RTFM :)
>> Could you post details about your test setup? Are you still using
>> WebKit for your tests?
> I'm on Win7 x64, Core i5 M560, WD 7200 Laptop HDD, NTFS, no virus
> scanner, truecrypt, no defragger.

OK, so truecrypt and luafv may screw things up for you (according to my 
measurements, luafv roughly doubles lstat times on C:).

> I've tried to be a bit smarter with the intent of my code, and this
> is what I came up with.
> diff --git a/cache.h b/cache.h
> index 4bf19e3..2e9fb1f 100644
> --- a/cache.h
> +++ b/cache.h
> @@ -294,7 +294,7 @@ extern void free_name_hash(struct index_state *istate);
>  #define active_cache_changed (the_index.cache_changed)
>  #define active_cache_tree (the_index.cache_tree)
> -#define read_cache() read_index(&the_index)
> +#define read_cache() read_index_preload(&the_index, NULL)
>  #define read_cache_from(path) read_index_from(&the_index, (path))
>  #define read_cache_preload(pathspec) read_index_preload(&the_index, (pathspec))
>  #define is_cache_unborn() is_index_unborn(&the_index)
> diff --git a/read-cache.c b/read-cache.c
> index c3d5e35..5fb2788 100644
> --- a/read-cache.c
> +++ b/read-cache.c
> @@ -1866,7 +1866,7 @@ int read_index_unmerged(struct index_state *istate)
>  int i;
>  int unmerged = 0;
> -read_index(istate);
> +read_index_preload(istate, NULL);
>  for (i = 0; i < istate->cache_nr; i++) {
>  struct cache_entry *ce = istate->cache[i];
>  struct cache_entry *new_ce;
> -- 

Ahh, I thought that you had enabled fscache during the entire checkout.

> Interestingly when I run on a cleanly checked out blink repo my
> changes seem to make matters worse in terms of performance, but when
> working on a repo with ignored files in it it seems to work better.
> So as a point of comparison I decided to run it on a repo with
> ignored files in it, in this case msysgit/git after
> a 'make install'. When I get a few hours I'll try to build blink and
> re-run the numbers on a much much larger repo.
> This comparison is an average of 3 cold cache runs of the
> kb/fscache-v4 [a] vs kb/fscache-v4 with my above changes applied [b],
> with preloadindex and fscache set to true.
> For comparison
> git status -s
> [a] 3.02s
> [b] 2.92s
> git reset --hard head
> [a] 3.67s
> [b] 3.09s

These numbers look far too good, so you don't actually do a fresh checkout, do 
you? I mean, delete all files except .git; killcache; git reset --hard / git 
checkout -f? That would also explain your 95% lstat times, if there's nothing 
to do...

> git add -u
> [a] 2.89s
> [b] 2.08s
> I noticed something interesting. Preload index uses 20 threads to do
> the work. When I was keeping an eye on them in task manager some
> threads will finish quite quickly, while others will run a lot
> longer. The way I understand the code at the moment the threads get
> equal chunks of work to perform. It's quite likely that even more
> performance could be obtained out of preload if the work splitting
> was 'smarter'. My currently best idea would be to use something like
> a lock-free queue to queue up the work and let the threads take the
> work off the queue. That way all threads are busy with work for
> longer. A candidate for the implementation would be libfds [1] queue.
> However my issue with this library and the reason I haven't tried to
> integrate is simply because the code expressly has no license.

As cache/cache_nr are not modified by the threads, you actually don't need a 
lock-free queue. An atomic counter shared by all threads should suffice (i.e. 
pthread's equivalent to InterlockedIncrement/InterlockedAdd).
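
To illustrate the idea, here is a minimal sketch (not the actual preload-index code) of threads pulling batches of entries via a shared atomic counter. It assumes C11 <stdatomic.h> is available in place of the Interlocked* functions. The names work_pool, worker, process_all, NR_THREADS and BATCH are made up for this example, and summing integers stands in for the per-entry lstat work.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>

#define NR_THREADS 4	/* hypothetical; pick whatever suits the workload */
#define BATCH 8		/* hypothetical number of entries claimed per fetch */

struct work_pool {
	const int *items;	/* stand-in for the cache array; read-only for workers */
	size_t nr;		/* stand-in for cache_nr */
	atomic_size_t next;	/* index of the next unclaimed entry */
	atomic_long sum;	/* demo result, stands in for real per-entry work */
};

static void *worker(void *arg)
{
	struct work_pool *pool = arg;

	for (;;) {
		/* Claim the next batch; each index range is handed out exactly once. */
		size_t start = atomic_fetch_add(&pool->next, BATCH);
		size_t end, i;
		long local = 0;

		if (start >= pool->nr)
			break;	/* nothing left, this thread is done */
		end = start + BATCH < pool->nr ? start + BATCH : pool->nr;
		for (i = start; i < end; i++)
			local += pool->items[i];	/* placeholder for lstat etc. */
		atomic_fetch_add(&pool->sum, local);
	}
	return NULL;
}

static long process_all(const int *items, size_t nr)
{
	struct work_pool pool;
	pthread_t threads[NR_THREADS];
	int i, rc;

	pool.items = items;
	pool.nr = nr;
	atomic_init(&pool.next, 0);
	atomic_init(&pool.sum, 0);

	for (i = 0; i < NR_THREADS; i++) {
		rc = pthread_create(&threads[i], NULL, worker, &pool);
		assert(rc == 0);
		(void)rc;
	}
	for (i = 0; i < NR_THREADS; i++)
		pthread_join(threads[i], NULL);
	return atomic_load(&pool.sum);
}
```

A thread that happens to get cheap entries simply comes back for the next batch sooner, so no thread sits idle while others still have a large fixed chunk to finish. The batch size is a trade-off between contention on the counter and load balance.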

