I posted this to comp.version-control.git.user and didn't get any response. I
think the question is plumbing-related enough that I can ask it here.
I'm evaluating the feasibility of moving my team from SVN to git. We have a very
large repo. [1] We will have a central repo using GitLab (or similar) that
everybody works with. Forks, code sharing, pull requests etc. will be done
through this central server.
By 'performance', I guess I mean speed of day to day operations for devs.
* (Obviously, trivially, a (non-local) clone will be slow with a large repo.)
* Will a few simultaneous clones from the central server also slow down
other concurrent operations for other users?
* Will 'git pull' be slow?
* 'git push'?
* 'git commit'? (It is listed as slow in reference [3].)
* 'git status'? (Slow again in reference [3] though I don't see it.)
* Some operations might not seem to be day-to-day, but if they are called
frequently by the web front-end to GitLab/Stash/GitHub etc., then
they can become bottlenecks. (E.g. 'git branch --contains' seems terribly
adversely affected by large numbers of branches.)
* Others?
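To make the list above concrete, here is a minimal sketch of how the local, read-only operations could be timed against an existing working copy. It is only an illustration: the helper names (time_command, summarize, report), the repeat count and the choice of commands are my own assumptions, and it says nothing about load on the central server.

```python
import statistics
import subprocess
import time

# Read-only commands only, so repeated runs do not change the repo.
DEFAULT_COMMANDS = [
    ["git", "status"],
    ["git", "branch", "--contains", "HEAD"],
]


def time_command(argv, cwd=None, repeats=5):
    """Run argv `repeats` times; return wall-clock durations in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        subprocess.run(argv, cwd=cwd, check=True,
                       stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL)
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Reduce a list of durations to min / median / max."""
    return {
        "min": min(durations),
        "median": statistics.median(durations),
        "max": max(durations),
    }


def report(repo, commands=DEFAULT_COMMANDS):
    """Print timing statistics for each command, run inside `repo`."""
    for argv in commands:
        print(" ".join(argv), summarize(time_command(argv, cwd=repo)))
```

Calling report() with the path to a working copy prints one line per command. The first run of each command will usually include cold-cache effects, so the minimum and the median are probably more telling than the maximum.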
Assuming I can put lots of resources into a central server with lots of CPU,
RAM, fast SSD, fast networking, what aspects of the repo are most likely to
affect devs' experience?
* Number of commits
* Sheer disk space occupied by the repo
* Number of tags.
* Number of branches.
* Binary objects in the repo that cause it to bloat in size [1]
* Other factors?
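For reference, the factors above can be measured on an existing repo roughly as follows. This is a sketch under my own assumptions: the function names are made up for illustration, it counts local branches only, and it does not try to identify binary objects.

```python
import subprocess


def git_lines(args, cwd="."):
    """Run a git command and return its non-empty output lines."""
    out = subprocess.run(["git"] + args, cwd=cwd, check=True,
                         capture_output=True, text=True).stdout
    return [line for line in out.splitlines() if line.strip()]


def parse_count_objects(lines):
    """Turn 'key: value' lines from `git count-objects -v` into a dict.

    Only purely numeric values are kept; other lines are skipped.
    """
    stats = {}
    for line in lines:
        key, _, value = line.partition(":")
        value = value.strip()
        if value.isdigit():
            stats[key.strip()] = int(value)
    return stats


def repo_metrics(cwd="."):
    """Collect commit, tag and branch counts plus on-disk size (KiB)."""
    objects = parse_count_objects(git_lines(["count-objects", "-v"], cwd))
    return {
        "commits": int(git_lines(["rev-list", "--count", "--all"], cwd)[0]),
        "tags": len(git_lines(["tag", "--list"], cwd)),
        "local_branches": len(git_lines(["for-each-ref", "refs/heads"], cwd)),
        "packed_size_kib": objects.get("size-pack", 0),
        "loose_size_kib": objects.get("size", 0),
    }
```

Running repo_metrics() inside a clone returns a small dict that could be recorded next to any timing numbers, so that results from different repos can be compared.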
Of the various HW items listed above --CPU speed, number of cores, RAM, SSD,
networking-- which is most critical here?
(Stash recommends available RAM of 1.5 x repo_size x number of concurrent
clones. I assume that is good advice in general.)
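As a worked example of that rule of thumb, here is the arithmetic for the 15 GB repo size I use below. The function name and the clone counts are only illustrative assumptions; the 1.5 factor is the Stash recommendation as I understand it.

```python
def stash_ram_estimate_gb(repo_size_gb, concurrent_clones, factor=1.5):
    """RAM estimate in GB: factor x repo size x number of concurrent clones."""
    return factor * repo_size_gb * concurrent_clones


# Hypothetical clone counts for a 15 GB repo.
for clones in (1, 4, 10):
    print(clones, "clones:", stash_ram_estimate_gb(15, clones), "GB")
```

So by this rule even four simultaneous clones of a 15 GB repo would call for about 90 GB of available RAM.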
Assume ridiculous numbers. Let me exaggerate: say 1 million commits, 15 GB repo,
50k tags, 1,000 branches. (Due to historical code fixups, there are another
5,000 "fix-up branches", each just one little dangling commit required to
change the code a little bit between a commit and a tag that was not quite
made from it.)
While there's lots of information online, much of it is old [3] and with git
constantly evolving I don't know how valid it still is. Then there's anecdotal
evidence that is of questionable value.[2]
Are many/all of the issues Facebook identified [3] resolved? (Yes, I
understand Facebook went with Mercurial. But I imagine the git team nevertheless
took their analysis to heart.)
Thanks,
Steve
[1] (Yes, I'm investigating ways to make our repo not so large etc. That's
beyond the scope of the discussion I'd like to have with this
question. Thanks.)
[2] The large amounts of anecdotal evidence relate to the "why don't you try it
yourself?" response to my question. I will if I have to, but setting up a
properly methodical study is time consuming and difficult --I don't want to
produce poor anecdotal numbers that don't really hold up-- and if somebody's
already done the work, then I should leverage it.
[3] http://thread.gmane.org/gmane.comp.version-control.git/189776