On Wed, Mar 27, 2013 at 12:04 PM, Jeff King <p...@peff.net> wrote:
> Yes, I think that's pretty much the case (though most of my
> Git-on-Windows experience is from cygwin long ago, where the stat
> performance was truly horrendous). Have you tried setting
> core.preloadindex, which should run the stats in parallel?

I wonder if preloadindex shouldn't be enabled by default. It's a huge
deal on NFS, and the only real downside is that it expects threading
to work. It potentially slows things down a tiny bit for single-CPU
cases with everything cached, but that isn't likely to be a relevant
issue.

Of course, it can trigger filesystem scalability issues, and as a
result it will often not help very much if you have the bulk of your
files in one (or a few) directories. But anybody who has so many files
that performance is an issue is not likely to have them all in one
directory.

And apparently the Windows FS metadata caching sucks, and things fall
out of the cache for large trees. Color me not-very-surprised. It's
probably some size limit on the metadata that you can tweak. So I'm
sure there's some registry setting or other that would make Windows
able to cache more than a few thousand filenames, and it would
probably improve performance a lot, but I do think preloadindex has
been around long enough that it could just be the default.

Of course, Jim should verify that preloadindex actually does solve his
problem.  With 20k+ files, it should max out the 20 IO threads for
preloading, and assuming the filesystem IO scales reasonably well, it
should fix the problem. But we do do a number of metadata ops
synchronously even with preloadindex, so things won't scale perfectly.
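
To make the idea concrete, here is a rough Python sketch of the
preloading pattern: split the file list into chunks and lstat each
chunk on its own thread, capped at 20 threads. This is not git's
actual C code. The names (thread_count, lstat_chunk, preload_stats)
and the FILES_PER_THREAD threshold are made up for illustration; only
the cap of 20 comes from the text above.

```python
import os
from concurrent.futures import ThreadPoolExecutor

MAX_THREADS = 20          # the cap mentioned above
FILES_PER_THREAD = 1000   # hypothetical threshold, NOT taken from git


def thread_count(n_files):
    """Pick a thread count: one per FILES_PER_THREAD files, at most MAX_THREADS."""
    return max(1, min(MAX_THREADS, n_files // FILES_PER_THREAD))


def lstat_chunk(paths):
    """lstat every path in one chunk; a missing file is recorded as None."""
    results = {}
    for p in paths:
        try:
            results[p] = os.lstat(p)
        except OSError:
            results[p] = None
    return results


def preload_stats(paths):
    """Run the per-file lstat calls in parallel and merge the results."""
    paths = list(paths)
    if not paths:
        return {}
    n = thread_count(len(paths))
    chunk = -(-len(paths) // n)  # ceiling division
    chunks = [paths[i:i + chunk] for i in range(0, len(paths), chunk)]
    merged = {}
    with ThreadPoolExecutor(max_workers=n) as pool:
        for part in pool.map(lstat_chunk, chunks):
            merged.update(part)
    return merged
```

With this (assumed) threshold, a 20k-file tree gets all 20 threads,
while a small tree stays on a single thread and pays almost no
threading overhead.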

(In particular: we do open each directory and do the readdir stuff and
try to open .gitignore whether it exists or not. So you'll get
synchronous IO for each directory, but at least the per-file IO to
check all the file stat data should scale).
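
The part that stays serial looks roughly like the following Python
sketch: one readdir and one attempted open of .gitignore per
directory, done one after the other. Again, this is only an
illustration and not git's code; scan_directories is a made-up name,
and skipping ".git" is my own simplification.

```python
import os


def scan_directories(root):
    """Walk the tree serially: list each directory, then try its .gitignore.

    Returns (directories read, .gitignore files actually found).
    """
    dirs_read = 0
    ignore_found = 0
    for dirpath, dirnames, _filenames in os.walk(root):
        # Simplification: do not descend into the repository metadata.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        dirs_read += 1
        try:
            # The open is attempted whether or not the file exists.
            with open(os.path.join(dirpath, ".gitignore")) as f:
                f.read()
            ignore_found += 1
        except OSError:
            pass
    return dirs_read, ignore_found
```

So the synchronous cost grows with the number of directories, not
with the number of files.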

