Re: [PATCH] update-index/diff-index: use core.preloadindex to improve performance

2012-11-13 Thread karsten . blees
Jeff King p...@peff.net wrote on 02.11.2012 16:26:16:

 On Tue, Oct 30, 2012 at 10:50:42AM +0100, karsten.bl...@dcon.de wrote:
 
  'update-index --refresh' and 'diff-index' (without --cached) don't 
honor
  the core.preloadindex setting yet. Porcelain commands using these 
(such as
  git [svn] rebase) suffer from this, especially on Windows.
  
  Use read_cache_preload to improve performance.
  
  Additionally, in builtin/diff.c, don't preload index status if we 
don't
  access the working copy (--cached).
  
  Results with msysgit on WebKit repo (2GB in 200k files):
  
  | update-index | diff-index | rebase
  +--++-
  msysgit-v1.8.0  |   9.157s |10.536s | 42.791s
  + preloadindex  |   9.157s |10.536s | 28.725s
  + this patch|   2.329s | 2.752s | 15.152s
  + fscache [1]   |   0.731s | 1.171s |  8.877s
 
 Cool numbers. On my quad-core SSD Linux box, I saw a few speedups, too.
 Here are the numbers for update-index --refresh on the WebKit repo
 (all are wall clock time, best-of-five):
 
  | before | after
   ---++
   cold cache | 4.513s | 2.059s
   warm cache | 0.252s | 0.164s


Great, and thanks for testing on Linux (I only have Linux VMs for testing, 
and I couldn't get meaningful performance data from those).
 
 Not as dramatic, but still nice. I wonder how a spinning disk would fare
 on the cold-cache case, though.  I also tried it with all but one CPU
 disabled, and the warm cache case was a little bit slower.
 

Unfortunately, with a 'real' disk, cold cache times increase with the 
number of threads. I've played around with '#define MAX_PARALLEL 20' in 
preload-index.c, with the following results ('git update-index --refresh' 
on WebKit repo, Win7 x64, Core i7 860 (2.8GHz 4 Core HT), WD VelociRaptor 
(300G 10krpm 8ms), msysgit 1.8.0 + preload-index patch + fscache patch):

MAX_PARALLEL | cold cache | warm cache
-++
no preload   | 49.938 | 9.204 (*)
   1 | 45.412 | 1.622
   2 | 55.334 | 1.123
   3 | 65.973 | 0.982
   4 | 67.579 | 0.889
   5 | 76.159 | 0.827
   6 | 81.510 | 0.811
   7 | 86.269 | 0.858
   8 | 85.862 | 0.827
...
  10 | 87.953 | 0.717
...
  20 |176.251 | 0.749

(*) core.preloadindex currently also disables fscache, thus the 9s

With more threads, Windows resource monitor also shows increasing disk 
queue length and response times.

It seems clear that more concurrent disk seeks hurt cold cache performance 
pretty badly. On the other hand, warm cache improvements for #threads  
#cores are only about 10 - 20%.

I don't know if Linux is better at caching / prefetching directory 
listings (might depend on file system, too), but perhaps MAX_PARALLEL 
should be set to a more reasonable value, or be made configurable (e.g. 
core.preloadIndexThreads)?

Karsten
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] update-index/diff-index: use core.preloadindex to improve performance

2012-11-13 Thread karsten . blees
Jeff King p...@peff.net wrote on 02.11.2012 16:38:00:

 On Fri, Nov 02, 2012 at 11:26:16AM -0400, Jeff King wrote:
 
  Still, I don't think we need to worry about performance regressions,
  because people who don't have a setup suitable for it will not turn on
  core.preloadindex in the first place. And if they have it on, the more
  places we use it, probably the better.
 
 BTW, your patch was badly damaged in transit (wrapped, and tabs
 converted to spaces). I was able to fix it up, but please check your
 mailer's settings.
 

Yes, I feared as much, that's why I included the pull URL (the company MTA 
only accepts outbound mail from Notes clients, sorry).

Is there a policy for people with broken mailers (send patch as 
attachment, add pull URL more prominently, don't include plaintext patch 
at all...)?

--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] update-index/diff-index: use core.preloadindex to improve performance

2012-11-13 Thread Junio C Hamano
karsten.bl...@dcon.de writes:

 Jeff King p...@peff.net wrote on 02.11.2012 16:38:00:

 On Fri, Nov 02, 2012 at 11:26:16AM -0400, Jeff King wrote:
 
  Still, I don't think we need to worry about performance regressions,
  because people who don't have a setup suitable for it will not turn on
  core.preloadindex in the first place. And if they have it on, the more
  places we use it, probably the better.
 
 BTW, your patch was badly damaged in transit (wrapped, and tabs
 converted to spaces). I was able to fix it up, but please check your
 mailer's settings.
 

 Yes, I feared as much, that's why I included the pull URL (the company MTA 
 only accepts outbound mail from Notes clients, sorry).

 Is there a policy for people with broken mailers (send patch as 
 attachment, add pull URL more prominently, don't include plaintext patch 
 at all...)?

If anything, fix your mailer probably is the policy you are
looking for, I think.
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] update-index/diff-index: use core.preloadindex to improve performance

2012-11-13 Thread Karsten Blees
Am 13.11.2012 17:46, schrieb Junio C Hamano:
 karsten.bl...@dcon.de writes:
 
 If anything, fix your mailer probably is the policy you are
 looking for, I think.

Well then...I've cloned myself @gmail, I hope this is better.

Just some provoking thoughts...(if I may):

RFC-5322 recommends wrapping lines at 78, and mail relays and gateways are 
allowed to change message content according to the capabilities of the receiver 
(RFC-5598). In essence, plaintext mail is completely unsuitable for 
preformatted text such as source code.

On the other hand, git tries to solve the very problem of distributed source 
code management, and consistency by strong sha-1 checksums is on the top of the 
feature list.

It is somehow hard to believe that contributing to git itself should only be 
possible using the most unreliable of protocols. Don't you trust your own 
software?


-- 8 --
Subject: [PATCH] update-index/diff-index: use core.preloadindex to improve 
performance

'update-index --refresh' and 'diff-index' (without --cached) don't honor
the core.preloadindex setting yet. Porcelain commands using these (such as
git [svn] rebase) suffer from this, especially on Windows.

Use read_cache_preload to improve performance.

Additionally, in builtin/diff.c, don't preload index status if we don't
access the working copy (--cached).

Results with msysgit on WebKit repo (2GB in 200k files):

| update-index | diff-index | rebase
+--++-
msysgit-v1.8.0  |   9.157s |10.536s | 42.791s
+ preloadindex  |   9.157s |10.536s | 28.725s
+ this patch|   2.329s | 2.752s | 15.152s
+ fscache [1]   |   0.731s | 1.171s |  8.877s

[1] https://github.com/kblees/git/tree/kb/fscache-v3

Thanks-to: Albert Krawczyk pro-lo...@optusnet.com.au
Signed-off-by: Karsten Blees bl...@dcon.de
---

Can also be pulled from: 
https://github.com/kblees/git/tree/kb/update-diff-index-preload-upstream

More performance figures (for msysgit) can be found in this discussion: 
https://github.com/pro-logic/git/commit/32c03dd8


 builtin/diff-index.c   |  8 ++--
 builtin/diff.c | 12 
 builtin/update-index.c |  1 +
 3 files changed, 15 insertions(+), 6 deletions(-)

diff --git a/builtin/diff-index.c b/builtin/diff-index.c
index 2eb32bd..1c737f7 100644
--- a/builtin/diff-index.c
+++ b/builtin/diff-index.c
@@ -41,9 +41,13 @@ int cmd_diff_index(int argc, const char **argv, const char 
*prefix)
if (rev.pending.nr != 1 ||
rev.max_count != -1 || rev.min_age != -1 || rev.max_age != -1)
usage(diff_cache_usage);
-   if (!cached)
+   if (!cached) {
setup_work_tree();
-   if (read_cache()  0) {
+   if (read_cache_preload(rev.diffopt.pathspec.raw)  0) {
+   perror(read_cache_preload);
+   return -1;
+   }
+   } else if (read_cache()  0) {
perror(read_cache);
return -1;
}
diff --git a/builtin/diff.c b/builtin/diff.c
index 9650be2..198b921 100644
--- a/builtin/diff.c
+++ b/builtin/diff.c
@@ -130,8 +130,6 @@ static int builtin_diff_index(struct rev_info *revs,
usage(builtin_diff_usage);
argv++; argc--;
}
-   if (!cached)
-   setup_work_tree();
/*
 * Make sure there is one revision (i.e. pending object),
 * and there is no revision filtering parameters.
@@ -140,8 +138,14 @@ static int builtin_diff_index(struct rev_info *revs,
revs-max_count != -1 || revs-min_age != -1 ||
revs-max_age != -1)
usage(builtin_diff_usage);
-   if (read_cache_preload(revs-diffopt.pathspec.raw)  0) {
-   perror(read_cache_preload);
+   if (!cached) {
+   setup_work_tree();
+   if (read_cache_preload(revs-diffopt.pathspec.raw)  0) {
+   perror(read_cache_preload);
+   return -1;
+   }
+   } else if (read_cache()  0) {
+   perror(read_cache);
return -1;
}
return run_diff_index(revs, cached);
diff --git a/builtin/update-index.c b/builtin/update-index.c
index 74986bf..ada1dff 100644
--- a/builtin/update-index.c
+++ b/builtin/update-index.c
@@ -593,6 +593,7 @@ struct refresh_params {
 static int refresh(struct refresh_params *o, unsigned int flag)
 {
setup_work_tree();
+   read_cache_preload(NULL);
*o-has_errors |= refresh_cache(o-flags | flag);
return 0;
 }
-- 
1.8.0.msysgit.0.3.g7d9d98c
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] update-index/diff-index: use core.preloadindex to improve performance

2012-11-02 Thread Jeff King
On Tue, Oct 30, 2012 at 10:50:42AM +0100, karsten.bl...@dcon.de wrote:

 'update-index --refresh' and 'diff-index' (without --cached) don't honor
 the core.preloadindex setting yet. Porcelain commands using these (such as
 git [svn] rebase) suffer from this, especially on Windows.
 
 Use read_cache_preload to improve performance.
 
 Additionally, in builtin/diff.c, don't preload index status if we don't
 access the working copy (--cached).
 
 Results with msysgit on WebKit repo (2GB in 200k files):
 
 | update-index | diff-index | rebase
 +--++-
 msysgit-v1.8.0  |   9.157s |10.536s | 42.791s
 + preloadindex  |   9.157s |10.536s | 28.725s
 + this patch|   2.329s | 2.752s | 15.152s
 + fscache [1]   |   0.731s | 1.171s |  8.877s

Cool numbers. On my quad-core SSD Linux box, I saw a few speedups, too.
Here are the numbers for update-index --refresh on the WebKit repo
(all are wall clock time, best-of-five):

 | before | after
  ---++
  cold cache | 4.513s | 2.059s
  warm cache | 0.252s | 0.164s

Not as dramatic, but still nice. I wonder how a spinning disk would fare
on the cold-cache case, though.  I also tried it with all but one CPU
disabled, and the warm cache case was a little bit slower.

Still, I don't think we need to worry about performance regressions,
because people who don't have a setup suitable for it will not turn on
core.preloadindex in the first place. And if they have it on, the more
places we use it, probably the better.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] update-index/diff-index: use core.preloadindex to improve performance

2012-11-02 Thread Jeff King
On Fri, Nov 02, 2012 at 11:26:16AM -0400, Jeff King wrote:

 Still, I don't think we need to worry about performance regressions,
 because people who don't have a setup suitable for it will not turn on
 core.preloadindex in the first place. And if they have it on, the more
 places we use it, probably the better.

BTW, your patch was badly damaged in transit (wrapped, and tabs
converted to spaces). I was able to fix it up, but please check your
mailer's settings.

-Peff
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


[PATCH] update-index/diff-index: use core.preloadindex to improve performance

2012-10-30 Thread karsten . blees
'update-index --refresh' and 'diff-index' (without --cached) don't honor
the core.preloadindex setting yet. Porcelain commands using these (such as
git [svn] rebase) suffer from this, especially on Windows.

Use read_cache_preload to improve performance.

Additionally, in builtin/diff.c, don't preload index status if we don't
access the working copy (--cached).

Results with msysgit on WebKit repo (2GB in 200k files):

| update-index | diff-index | rebase
+--++-
msysgit-v1.8.0  |   9.157s |10.536s | 42.791s
+ preloadindex  |   9.157s |10.536s | 28.725s
+ this patch|   2.329s | 2.752s | 15.152s
+ fscache [1]   |   0.731s | 1.171s |  8.877s

[1] https://github.com/kblees/git/tree/kb/fscache-v3

Thanks-to: Albert Krawczyk pro-lo...@optusnet.com.au
Signed-off-by: Karsten Blees bl...@dcon.de
---

Can also be pulled from: 
https://github.com/kblees/git/tree/kb/update-diff-index-preload-upstream

I thought I might send this upstream directly, as its not msysgit related. 
More performance figures (for msysgit) can be found in this discussion: 
https://github.com/pro-logic/git/commit/32c03dd8

Ciao,
Karsten

 builtin/diff-index.c   |  8 ++--
 builtin/diff.c | 12 
 builtin/update-index.c |  1 +
 3 files changed, 15 insertions(+), 6 deletions(-)

diff --git a/builtin/diff-index.c b/builtin/diff-index.c
index 2eb32bd..1c737f7 100644
--- a/builtin/diff-index.c
+++ b/builtin/diff-index.c
@@ -41,9 +41,13 @@ int cmd_diff_index(int argc, const char **argv, const 
char *prefix)
if (rev.pending.nr != 1 ||
rev.max_count != -1 || rev.min_age != -1 || rev.max_age != -1)
usage(diff_cache_usage);
-   if (!cached)
+   if (!cached) {
setup_work_tree();
-   if (read_cache()  0) {
+   if (read_cache_preload(rev.diffopt.pathspec.raw)  0) {
+   perror(read_cache_preload);
+   return -1;
+   }
+   } else if (read_cache()  0) {
perror(read_cache);
return -1;
}
diff --git a/builtin/diff.c b/builtin/diff.c
index 9650be2..198b921 100644
--- a/builtin/diff.c
+++ b/builtin/diff.c
@@ -130,8 +130,6 @@ static int builtin_diff_index(struct rev_info *revs,
usage(builtin_diff_usage);
argv++; argc--;
}
-   if (!cached)
-   setup_work_tree();
/*
 * Make sure there is one revision (i.e. pending object),
 * and there is no revision filtering parameters.
@@ -140,8 +138,14 @@ static int builtin_diff_index(struct rev_info *revs,
revs-max_count != -1 || revs-min_age != -1 ||
revs-max_age != -1)
usage(builtin_diff_usage);
-   if (read_cache_preload(revs-diffopt.pathspec.raw)  0) {
-   perror(read_cache_preload);
+   if (!cached) {
+   setup_work_tree();
+   if (read_cache_preload(revs-diffopt.pathspec.raw)  0) {
+   perror(read_cache_preload);
+   return -1;
+   }
+   } else if (read_cache()  0) {
+   perror(read_cache);
return -1;
}
return run_diff_index(revs, cached);
diff --git a/builtin/update-index.c b/builtin/update-index.c
index 74986bf..ada1dff 100644
--- a/builtin/update-index.c
+++ b/builtin/update-index.c
@@ -593,6 +593,7 @@ struct refresh_params {
 static int refresh(struct refresh_params *o, unsigned int flag)
 {
setup_work_tree();
+   read_cache_preload(NULL);
*o-has_errors |= refresh_cache(o-flags | flag);
return 0;
 }
-- 
1.8.0.msysgit.0.3.g7d9d98c
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH] update-index/diff-index: use core.preloadindex to improve performance

2012-10-30 Thread Erik Faye-Lund
On Tue, Oct 30, 2012 at 10:50 AM,  karsten.bl...@dcon.de wrote:
 'update-index --refresh' and 'diff-index' (without --cached) don't honor
 the core.preloadindex setting yet. Porcelain commands using these (such as
 git [svn] rebase) suffer from this, especially on Windows.

 Use read_cache_preload to improve performance.

 Additionally, in builtin/diff.c, don't preload index status if we don't
 access the working copy (--cached).

 Results with msysgit on WebKit repo (2GB in 200k files):

 | update-index | diff-index | rebase
 +--++-
 msysgit-v1.8.0  |   9.157s |10.536s | 42.791s
 + preloadindex  |   9.157s |10.536s | 28.725s
 + this patch|   2.329s | 2.752s | 15.152s
 + fscache [1]   |   0.731s | 1.171s |  8.877s


Wow, awesome results :)

This also makes me want to play around with the fscache stuff a bit;
about an order of magnitude improvement is quite noticeable :)
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html