Hi all,

Getting back to this oldish thread ...

I have re-analyzed my old test figures, and done some additional tests. It seems I was mistaken in some of my conclusions (pointing to I/O instead of CPU), so apologies to Stefan^2 and others for "wasted cycles" in this discussion. I see now that you are mostly correct, Stefan, in focusing on optimizing for CPU usage, and I hope you can save as many cycles as possible ;-).

The only use case where I definitely see server-side I/O-boundedness is log (see below). All the other actions I tested (update, checkout, blame) were more sensitive to CPU than to I/O. I thought blame was also server-I/O-bound, but it is not (it is both client-side and server-side CPU-bound).

I tested by comparing 5 setups (2 machines, different storage):

1) Sun Sparc box (Sol 10), 32 x 1.2 GHz, svn 1.5.4, FSFS back-end on SAN/NFS (current production setup)
2) Sun Sparc box (Sol 10), 32 x 1.2 GHz, svn 1.5.4, FSFS back-end on local 10k disk
3) x86 box (Win Vista), 3.0 GHz Core2Duo, svn 1.6.9, FSFS back-end unpacked on local 10k disk
4) x86 box (Win Vista), 3.0 GHz Core2Duo, svn 1.6.9, FSFS back-end packed on local 10k disk
5) x86 box (Win Vista), 3.0 GHz Core2Duo, svn 1.6.9, FSFS back-end packed on local SSD disk

(The machine used for 3), 4) and 5) is not really a proper server (not suitable for huge loads, lots of concurrent requests, etc.), but I just used it to compare some things.)

By comparing those 5 setups, I could see which change had the most impact for a particular use case: changing storage on the same machine, or changing to a different CPU/OS with the same storage. A crude way to test, but still it gave me some indications.

Blame:
- Almost no difference between 1) and 2).
- Huge difference (4 or 5 times faster) by switching to setups 3), 4) or 5) (not much difference between them).
- Conclusion: CPU-bound

Log:
- Big difference between 1) and 2) (2.4 times faster).
- 3) and 4) perform almost identically to 2).
- Huge difference between 2), 3), 4) on the one hand, and 5) on the other (SSD almost 4 times faster than 10k disk).
- Conclusion: I/O-bound

About log, some inline replies below ...

On Sun, May 16, 2010 at 12:29 PM, Stefan Fuhrmann <stefanfuhrm...@alice-dsl.de> wrote:
[snip]
> Johan Corveleyn wrote:
[snip]
>> I mainly focused on log and blame (and checkout/update to a lesser
>> degree), so that may be one of the reasons why we're seeing it
>> differently :-) . I suppose the numbers, bottlenecks, ... totally
>> depend on the use case (as well as the hardware/network setup).
>>
>
> The log performance issue has been solved more or less
> in TSVN. In 1.7, we also brought the UI up to speed
> with the internals: even complex full-text searches over
> millions of changes are (almost) interactive.

Yeah, I know TSVN has worked around the slowness of log, and that's great. But still, I would like log to be fast in svn core as well. Sometimes a build script or whatever needs to retrieve some log info with the CLI, or your particular IDE integration doesn't have the option of only asking for the last 100 log entries, caching stuff client-side, etc. And besides, maybe it would make TSVN's log even faster :-).

> To speed up log on the server side, you need to maintain
> an index. That's certainly not going to happen before fs-ng.
> Otherwise, you will always end up reading every revision file.
> Only exception: log on the repo root with no changed path
> listing.

Yes, the best solution would indeed be more/better indexing in the back-end storage, and I understand that will not happen anytime soon. However, even with the current proverbial "full table scan", improvements can be made I think. Right now, I have the impression that svn "scans the table" about 5 times (see old threads [1] and [2], and recent discussion [3]). A single table scan should suffice :-). But that's probably for another thread...
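
As an aside, the reasoning behind the five-setup comparison at the top of this mail can be written down as a small Python sketch. This is only an illustration and has nothing to do with svn's own code: the numbers are rough relative times reconstructed from the ratios quoted above (setup 2 = 1.0), not actual measurements, and the names RELATIVE_TIMES, STORAGE_PAIRS, MACHINE_PAIRS and classify, as well as the 1.5x threshold, are made up for the example.

```python
# Relative wall-clock times per setup, normalised so that setup 2 = 1.0.
# ASSUMPTION: these are NOT measured values, only rough ratios rebuilt from
# the mail ("2.4 times faster", "almost 4 times faster", "4 or 5 times faster").
RELATIVE_TIMES = {
    "blame": {1: 1.0, 2: 1.0, 3: 0.22, 4: 0.22, 5: 0.22},
    "log":   {1: 2.4, 2: 1.0, 3: 1.0,  4: 1.0,  5: 0.25},
}

# (slower, faster) pairs that differ only in storage on the same machine:
# SAN/NFS vs local 10k disk, and local 10k disk vs local SSD.
STORAGE_PAIRS = [(1, 2), (4, 5)]

# Pair that keeps the storage type (local 10k disk) but changes the
# machine, OS and svn version.
MACHINE_PAIRS = [(2, 3)]


def best_speedup(times, pairs):
    """Largest speedup factor seen across the given (slower, faster) pairs."""
    return max(times[slow] / times[fast] for slow, fast in pairs)


def classify(times, threshold=1.5):
    """Crude bottleneck guess: which kind of change bought the most speed?"""
    storage = best_speedup(times, STORAGE_PAIRS)
    machine = best_speedup(times, MACHINE_PAIRS)
    if storage >= threshold and storage >= machine:
        return "I/O-bound"
    if machine >= threshold:
        return "CPU-bound"
    return "inconclusive"


if __name__ == "__main__":
    for action, times in RELATIVE_TIMES.items():
        print(action, "->", classify(times))
```

With these reconstructed ratios the sketch labels blame as CPU-bound and log as I/O-bound, matching the conclusions above. It is just as crude as the test itself: the machine pair also changes the OS and the svn version, so a "CPU-bound" label really means "sensitive to something other than storage".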

[1] http://svn.haxx.se/dev/archive-2009-06/0459.shtml
[2] http://svn.haxx.se/dev/archive-2007-08/0239.shtml
[3] http://svn.haxx.se/dev/archive-2010-05/0153.shtml (ignore the performance figures in that thread for all actions except for log; this was basically a comparison between setups 1) and 5) from the above list, which is not so smart considering they differ both in storage and in CPU/architecture/OS/...)

Cheers,
--
Johan