Since I reboot fairly regularly to test new kernels, I don't *always*
have the kernel source tree in my caches, so I still care about some
cold-cache performance. And "git grep" tends to be the most noticeable

Now, I have a SSD, and even the cold-cache case takes just five
seconds or so, but that's still somethng I react to, since the normal
"kernel tree in cache" case ends up bring close enough to
instantaneous (half a second) that then when it takes longer I react
to it.

And I started thinking about it, and our "git grep" parallelism seems
to be limited to 8.

Which is fine most of the time for CPU parallelism (although maybe
some people with big machines would prefer higher numbers), but for IO
parallelism I thought that maybe we'd like a higher number...

So I tried it out, and with THREADS set to 32, I get a roughly 15%
performance boost for the cold-cache case (the error bar is high
enough to not give a very precise number, but I see it going from
~4.8-4.9s on my machine down to 4.2..4.6s).

That's on an SSD, though - the performance implications might be very
different for other use cases (NFS would likely prefer higher IO
parallelism and might show bigger improvement, while a rotational disk
might end up just thrashing more)

Is this a big deal? Probably not. But I did react to how annoying it
was to set the parallelism factor (recompile git with a new number).
Wouldn't it be lovely if it was slightly smarter (something more akin
to the index preloading that takes number of files into account) or at
least allowed people to set the parallelism explicitly with a command
line switch?

Right now it disables the parallel grep entirely for UP, for example.
Which makes perfect sense if grep is all about CPU use. But even UP
might improve from parallel IO..

To unsubscribe from this list: send the line "unsubscribe git" in
the body of a message to
More majordomo info at

Reply via email to