Re: [PATCH v2] read_directory: avoid invoking exclude machinery on tracked files

2013-02-17 Thread Pete Wyckoff
pclo...@gmail.com wrote on Sun, 17 Feb 2013 11:39 +0700:
 On Sun, Feb 17, 2013 at 1:11 AM, Pete Wyckoff p...@padd.com wrote:
  pclo...@gmail.com wrote on Sat, 16 Feb 2013 14:17 +0700:
  Finally some numbers (best of 20 runs) that show why it's worth all
  the hassle:
 
  git status   | webkit  linux-2.6  libreoffice-core  gentoo-x86
  -------------+------------------------------------------------
  before       | 1.097s  0.208s     0.399s            0.539s
  after        | 0.736s  0.159s     0.248s            0.501s
  nr. patterns | 89      376        19                0
  nr. tracked  | 182k    40k        63k               101k
 
  Thanks for this work.  I repeated some of the tests across NFS,
  where I'd expect to see bigger differences.
 
 This is about reducing CPU processing time, not I/O time, so no bigger
 difference is expected. I/O time can be reduced with inotify, or with FAM
 in the NFS case, because inotify does not support NFS.

Numbers from the last mail were core.preloadindex=true.  Here's
time output from average runs:

stock = 0m2.28s user 0m4.18s sys 0m11.28s elapsed 57.39 %CPU
duy   = 0m1.25s user 0m4.43s sys 0m7.45s elapsed 76.41 %CPU

With this huge repo, preloadindex may be stressing directory
cache behavior on the NFS server or client.  Your patch helps
both CPU and wait time by avoiding the 6000-odd open() of
non-existent .gitignore.
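
To make the cost concrete, here is a minimal sketch of a per-directory .gitignore probe. The helper names `probe_gitignore` and `count_missing` are hypothetical and this is not git's actual code; it only illustrates that every directory without a .gitignore still costs one failed open(), which on NFS may mean a round trip to the server.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper, not taken from git.
 * Returns 1 if <dir>/.gitignore can be opened, 0 if it does not
 * exist, and -1 on any other error. Each call costs one open().
 */
static int probe_gitignore(const char *dir)
{
	char path[4096];
	int n, fd;

	n = snprintf(path, sizeof(path), "%s/.gitignore", dir);
	if (n < 0 || (size_t)n >= sizeof(path))
		return -1;

	fd = open(path, O_RDONLY);
	if (fd < 0)
		return (errno == ENOENT || errno == ENOTDIR) ? 0 : -1;
	close(fd);
	return 1;
}

/* Count how many of the given directories paid for an open() that found nothing. */
static size_t count_missing(const char **dirs, size_t nr)
{
	size_t i, missing = 0;

	for (i = 0; i < nr; i++)
		if (probe_gitignore(dirs[i]) == 0)
			missing++;
	return missing;
}
```

Calling `count_missing()` over the directories of a work tree gives the number of wasted probes; the 6000-odd figure above is that count for this repository.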

With core.preloadindex=false, it's a 1 sec speedup, all from CPU:

stock = 0m2.18s user 0m1.59s sys 0m7.78s elapsed 48.45 %CPU
duy   = 0m1.17s user 0m1.63s sys 0m6.91s elapsed 40.59 %CPU


-- Pete
--
To unsubscribe from this list: send the line unsubscribe git in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH v2] read_directory: avoid invoking exclude machinery on tracked files

2013-02-17 Thread Junio C Hamano
Nguyễn Thái Ngọc Duy  pclo...@gmail.com writes:

 If path_handled is returned, contents goes up. And if check_only is
 true, the loop could be broken early. These will not happen when
 treat_one_path (and its wrapper treat_path) returns
 path_ignored. dir_add_name internally does a cache_name_exists() check
 so it makes no difference.

 To avoid this behavior change, treat_one_path is instructed to skip
 the optimization when check_only or contents is used.

OK, that makes it easier to understand why this is safe.

 @@ -1242,9 +1246,23 @@ enum path_treatment {
  static enum path_treatment treat_one_path(struct dir_struct *dir,
                                            struct strbuf *path,
                                            const struct path_simplify *simplify,
 -                                          int dtype, struct dirent *de)
 +                                          int dtype, struct dirent *de,
 +                                          int exclude_shortcut_ok)
  {
 ...
 @@ -1331,18 +1347,29 @@ static int read_directory_recursive(struct dir_struct *dir,
                goto out;

        while ((de = readdir(fdir)) != NULL) {
 -              switch (treat_path(dir, de, path, baselen, simplify)) {
 +              switch (treat_path(dir, de, path, baselen,
 +                                 simplify,
 +                                 !check_only && !contents)) {
 ...

Between these two places we may want to say what kind of short-cut
we are talking about, but there only is one kind of short-cut at
this moment, so let's leave that to other people who want to further
optimize things in this codepath by adding other short-cuts.
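
For readers following along, the one short-cut discussed here can be sketched roughly as follows. This is a toy model under stated assumptions, not the patch itself: `toy_index`, `toy_is_tracked`, `toy_is_excluded` and `toy_treat_one_path` are hypothetical stand-ins, and the single hard-coded "*.o" pattern replaces the real exclude machinery. Only the names `path_ignored`, `path_handled` and `exclude_shortcut_ok` come from the quoted text.

```c
#include <stddef.h>
#include <string.h>

enum path_treatment { path_ignored, path_handled };

/* Hypothetical stand-in for the index: a flat list of tracked paths. */
struct toy_index {
	const char **names;
	size_t nr;
};

static int toy_is_tracked(const struct toy_index *idx, const char *path)
{
	size_t i;

	for (i = 0; i < idx->nr; i++)
		if (!strcmp(idx->names[i], path))
			return 1;
	return 0;
}

/* Counts how often the (expensive) exclude matching was invoked. */
static int toy_exclude_calls;

/* Hypothetical exclude check with one hard-coded pattern, "*.o". */
static int toy_is_excluded(const char *path)
{
	size_t len = strlen(path);

	toy_exclude_calls++;
	return len >= 2 && !strcmp(path + len - 2, ".o");
}

static enum path_treatment toy_treat_one_path(const struct toy_index *idx,
					      const char *path,
					      int exclude_shortcut_ok)
{
	/* Short-cut: a tracked path is dropped before any pattern matching. */
	if (exclude_shortcut_ok && toy_is_tracked(idx, path))
		return path_ignored;
	if (toy_is_excluded(path))
		return path_ignored;
	return path_handled;
}
```

With the flag set, a tracked path returns `path_ignored` without touching `toy_is_excluded()`. With the flag clear, the same path goes through the exclude check and comes back as `path_handled`, which is the pre-patch behavior that callers using check_only or contents rely on.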

Thanks; will queue.


Re: [PATCH v2] read_directory: avoid invoking exclude machinery on tracked files

2013-02-16 Thread Pete Wyckoff
pclo...@gmail.com wrote on Sat, 16 Feb 2013 14:17 +0700:
 Finally some numbers (best of 20 runs) that show why it's worth all
 the hassle:
 
 git status   | webkit  linux-2.6  libreoffice-core  gentoo-x86
 -------------+------------------------------------------------
 before       | 1.097s  0.208s     0.399s            0.539s
 after        | 0.736s  0.159s     0.248s            0.501s
 nr. patterns | 89      376        19                0
 nr. tracked  | 182k    40k        63k               101k

Thanks for this work.  I repeated some of the tests across NFS,
where I'd expect to see bigger differences.  Best of 20 values
reported in min 

webkit
Stock min  9.61 avg 11.61 +/- 1.35 max 14.26
Duy   min  6.91 avg  7.67 +/- 0.46 max  8.71

linux
Stock min  2.27 avg  3.16 +/- 0.56 max  4.49
Duy   min  2.04 avg  3.12 +/- 0.69 max  4.87

libreoffice-core
Stock min  4.56 avg  5.79 +/- 0.79 max  7.08
Duy   min  3.96 avg  5.25 +/- 0.95 max  6.95

Similar 30%-ish speedup on webkit.  And an absolute gain
of 2.7 seconds is quite nice.
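
The arithmetic behind those two figures can be checked with a short sketch. The struct and function names are made up for illustration; the numbers are the "min" values listed above.

```c
#include <stdio.h>

struct timing {
	const char *repo;
	double stock;	/* best-of-20 "min" in seconds, unpatched git */
	double duy;	/* best-of-20 "min" in seconds, with the patch */
};

/* Minimum wall-clock times over NFS, copied from the figures above. */
static const struct timing nfs_min[] = {
	{ "webkit",           9.61, 6.91 },
	{ "linux",            2.27, 2.04 },
	{ "libreoffice-core", 4.56, 3.96 },
};

/* Seconds saved by the patch. */
static double absolute_gain(const struct timing *t)
{
	return t->stock - t->duy;
}

/* Fraction of the unpatched time that the patch saves. */
static double relative_gain(const struct timing *t)
{
	return absolute_gain(t) / t->stock;
}

/* Print one line per repository. */
static void print_gains(void)
{
	size_t i;

	for (i = 0; i < sizeof(nfs_min) / sizeof(nfs_min[0]); i++)
		printf("%-17s %.2fs saved (%.1f%%)\n", nfs_min[i].repo,
		       absolute_gain(&nfs_min[i]),
		       100.0 * relative_gain(&nfs_min[i]));
}
```

For webkit this gives 2.70 seconds saved, about 28% of the unpatched time, against roughly 33% for the local-disk figures quoted at the top (1.097s to 0.736s).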

-- Pete


Re: [PATCH v2] read_directory: avoid invoking exclude machinery on tracked files

2013-02-16 Thread Duy Nguyen
On Sun, Feb 17, 2013 at 1:11 AM, Pete Wyckoff p...@padd.com wrote:
 pclo...@gmail.com wrote on Sat, 16 Feb 2013 14:17 +0700:
 Finally some numbers (best of 20 runs) that show why it's worth all
 the hassle:

 git status   | webkit  linux-2.6  libreoffice-core  gentoo-x86
 -------------+------------------------------------------------
 before       | 1.097s  0.208s     0.399s            0.539s
 after        | 0.736s  0.159s     0.248s            0.501s
 nr. patterns | 89      376        19                0
 nr. tracked  | 182k    40k        63k               101k

 Thanks for this work.  I repeated some of the tests across NFS,
 where I'd expect to see bigger differences.

This is about reducing CPU processing time, not I/O time, so no bigger
difference is expected. I/O time can be reduced with inotify, or with FAM
in the NFS case, because inotify does not support NFS.

 Best of 20 values reported in min 

 webkit
 Stock min  9.61 avg 11.61 +/- 1.35 max 14.26
 Duy   min  6.91 avg  7.67 +/- 0.46 max  8.71

 linux
 Stock min  2.27 avg  3.16 +/- 0.56 max  4.49
 Duy   min  2.04 avg  3.12 +/- 0.69 max  4.87

 libreoffice-core
 Stock min  4.56 avg  5.79 +/- 0.79 max  7.08
 Duy   min  3.96 avg  5.25 +/- 0.95 max  6.95

 Similar 30%-ish speedup on webkit.  And an absolute gain
 of 2.7 seconds is quite nice.

 -- Pete



-- 
Duy