On Tue, May 11, 2010 at 07:43:33AM -0400, Mark Phippard wrote:
> On Tue, May 11, 2010 at 7:27 AM, Stefan Sperling <s...@elego.de> wrote:
> > On Tue, May 11, 2010 at 01:36:26AM +0200, Johan Corveleyn wrote:
> >> As I understand your set of patches, you're mainly focusing on saving
> >> cpu cycles, and not on avoiding I/O where possible (unless I'm missing
> >> something). Maybe some of the low- or high-level algorithms in the
> >> back-end can be reworked a bit to reduce the amount of I/O? Or maybe
> >> some clever caching can avoid some file accesses?
> >
> > In general, I think trying to work around I/O slowness by loading
> > stuff into RAM (caching) is a bad idea. You're just taking away memory
> > from the OS buffer cache if you do this. A good buffer cache in the OS
> > should make open/close/seek fast. (So don't run a windows server if
> > you can avoid it.)
> >
> > The only point where it's worth thinking about optimizing I/O
> > access is when you get to clustered, distributed storage, because
> > at that point every I/O request translated into a network packet.
>
> You had me until that last part. I think we should ALWAYS be thinking
> about optimizing I/O. I have little doubt that is where the biggest
> performance bottlenecks live (other than network of course). I agree
> that making a big cache is probably not the best way to go, but I
> think we should always be looking for optimizations where we avoid
> repeated open/closes that are not necessary.

That's true. Avoiding repeated open/close of the same file is a good
optimisation. Even with a good buffer cache it will make a difference.
So s/The only point/One point/ :)

> I think it is extremely common that our customers have their
> repositories on NFS-mounted or SAN storage. While these often have
> fast disk subsystems there is still a noticeable penalty for file
> opens. Have you looked at Blair's wiki before?
>
> http://www.orcaware.com/svn/wiki/Server_performance_tuning_for_Linux_and_Unix

Thanks, that was an interesting read.

Of course, network filesystems like NFS have the same network overhead
penalty (except that caching on the local client is probably a bit easier
than with truly distributed storage, but that's a minor detail).

Stefan
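
P.S. To make the open/close point concrete, here is a minimal sketch
in Python. It is not taken from the Subversion back-end; the function
names, the temporary file, and the offsets are all made up for
illustration. It reads the same chunks from one file twice: once
reopening the file for every chunk, once through a single handle, and
counts the opens. On local disk with a warm buffer cache the difference
is small, but on NFS each extra open is likely to cost at least one
network round trip.

```python
import os
import tempfile


def read_chunks_reopening(path, offsets, size):
    """Open, seek, read, and close once per chunk (the pattern to avoid)."""
    opens = 0
    chunks = []
    for off in offsets:
        with open(path, "rb") as f:
            opens += 1
            f.seek(off)
            chunks.append(f.read(size))
    return chunks, opens


def read_chunks_single_open(path, offsets, size):
    """Open once, then seek and read every chunk through the same handle."""
    chunks = []
    with open(path, "rb") as f:
        for off in offsets:
            f.seek(off)
            chunks.append(f.read(size))
    return chunks, 1


def demo():
    # Hypothetical stand-in for a repository file; the contents are arbitrary.
    fd, path = tempfile.mkstemp()
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(bytes(range(256)) * 16)  # 4096 bytes
        offsets = [0, 100, 1000, 4000]
        a, opens_a = read_chunks_reopening(path, offsets, 8)
        b, opens_b = read_chunks_single_open(path, offsets, 8)
        # Both strategies must return identical data; only the open count differs.
        return a == b, opens_a, opens_b
    finally:
        os.remove(path)
```

Calling demo() returns whether both strategies read identical data,
followed by the number of opens each one needed. The saving grows
linearly with the number of reads that hit the same file.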