On Wed, 2014-05-28 at 23:44 +0200, Michael Haggerty wrote: > On 05/28/2014 11:04 PM, David Turner wrote: > > In a repository with tens of thousands of refs, the command > > ~/git/git-diff-index --cached --quiet --ignore-submodules [revision] > > is a bit slow. check_refname_component is a major contributor to the > > runtime. Provide two optimized versions of this function: one for > > machines with SSE4.2, and one for machines without. The speedup for > > this command on one particular repo (with about 60k refs) is about 12% > > for the SSE version and 8% for the non-SSE version. > > Interesting. Thanks for the patch.
Thanks for your thoughtful comments. > Why do you use "git diff-index" to benchmark your change? Is there > something particular about that command that leads to especially bad > performance with lots of references? Because a colleague specifically asked me to look at it because he uses it as part of the zsh/bash git prompt dirty bit display. But that is not actually a good reason to use it in the commit-message. This is also the reason why milliseconds matter, although at present other parts of git are slow enough that the prompt stuff is still somewhat infeasible for large repos. > I assume that most of the time spent in check_refname_component() is > while reading the packed-refs file, right? Yes. > If so, there are probably > other git commands that would better measure that time, with less other > work. For example, "git rev-parse refname" will read the packed-refs > file, too (assuming that "refname" is not a loose reference). If you > test the speedup of a cheaper command like that, it seems to me that you > will get a better idea of how much you are speeding up the ref-reading > part. It would also be interesting to see the difference in > milliseconds (rather than a percentage) for some specified number of > references. I'll change it to rev-parse and to show the difference in milliseconds. The times on rev-parse are 35/29/25ms on one particular computer, for original, LUT, SSE. So, somewhat larger speedup in percentage terms. I should also note that this represents a real use-case, and I expect that the number of refs will be somewhat larger in the future. > I think it's worth using your LUT-based approach to save 8%. I'm less > sure it's worth the complication and unreadability of using SSE to get > the next 4% speedup. But the gains might be proven to be more > significant if you benchmark them differently. I was actually hoping at some point to use SSE to speed up a small few of the other slow bits e.g. get_sha1_hex. I have not yet tested if this will actually be faster, but I bet it will. And my watchman branch uses it to speed up the hashmap (but it seems CityHash works about as well so I might just port that instead). But actually speaking of SSE: is there a minimum compiler version for git? Because I have just discovered gcc-4.2 on the Mac has a bug which causes this code to misbehave. Yet again, compiler intrinsics prove less portable than straight assembly language. I would be just as happy to write it in assembly, but I suspect that nobody would enjoy that. I could also work around the bug with a compiler-specific ifdef, but Apple has been shipping clang for some years now, so I think it's OK to leave the code as-is. > I won't critique the code in detail until we have thrashed out the > metaissues, but I made a few comments below about nits that I noticed > when I scanned through. I'll go ahead and roll a new version with the nits picked. > Please add a comment here about what the values in refname_disposition > signify. And maybe add some line comments explaining unusual values, > for people who haven't memorized the ASCII encoding; e.g., on the third > line, I think I'm going to say, "see below for the list of illegal characters, from which this table is derived.", if that's OK with you. -- To unsubscribe from this list: send the line "unsubscribe git" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html