Anders Björklund wrote:
> Michael Kolomeytsev wrote:
>> I've discovered that there is too small buffer size for IO in ccache:
>> 16k or 10k (in hash_fd, copy_fd, copy_file).
>>
>
> But your observations are very interesting, and please post
> more if you have it. Would also be nice to have some follow-up
> on the observation about ccache problems with multiple cores:
> https://github.com/jrosdahl/ccache/issues/54 (also on OS X)
>
> I'm thinking that hash and copy could do with different macros...

Actually three macros: hash, compress/decompress, and plain old copy.
Thought I'd move the "copy" case aside, away from the other buffers...

You'd think that copying a file would be a simple thing to do, right?
Actually, on some systems like Windows or Mac OS X it is. But on Linux
it is not. Found this interesting blog post, which came with some
benchmarks too:
http://blog.plenz.com/2014-04/so-you-want-to-write-to-a-file-real-fast.html

So the first thing to do would be to make the I/O buffer size a whole
multiple of the block size, that is: 16384 instead of 10240. (Assuming
4096-byte blocks, 10240 is 2.5 blocks while 16384 is exactly 4.) That
avoids having to do partial page copies later.

And then allocating the buffer in kernel space instead of user space
sounded like a good idea. But having to look for various OS/kernel
versions of sendfile()? Eww. Might as well stick with splice(), since
the other main systems already have solutions: Win32 has CopyFile and
OS X has copyfile. And doing some "advise/allocate" sounded easy, but
had pitfalls too.

Here is the end result, in case anyone is interested in a preview:
https://github.com/jrosdahl/ccache/compare/master...itensionanders:uncompressed

It sounded like a good idea, but it needs some actual benchmarks to see
whether it was worth it. Probably should check st_blksize too.

The actual I/O can probably be made twice as fast (e.g. for a 1M file).
The question is whether it makes any real impact on the ccache run time.

pipe+splice + advices + trunc   1175ns   1283ns   1290ns
read+write 4bs                  1537ns   2126ns   2210ns   (+  30.8%)
read+write 10k                  2334ns   2356ns   2668ns   (+  98.6%)
read+write bs                   2515ns   2692ns   4591ns   (+ 114.0%)

But 256K seemed like overkill (over 16K), at least for plain copy I/O.
There might still be some additional benefits when doing gzip or md4,
though.

/Anders

PS.
We gave up on mmap already, for other reasons (high maintenance):
https://github.com/jrosdahl/ccache/commit/c358e7c801e265ce07e909d75f3f3fd4e16c7f65
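
For anyone who wants to play with the buffer-size point from the message (a whole multiple of the block size, and checking st_blksize), here is a minimal sketch of a plain read+write copy loop. It is not the code from the branch linked above: the names round_up_to_block, copy_fd_blocks and FALLBACK_BLOCK_SIZE are made up for illustration, and the 4096-byte fallback is an assumption for when fstat() gives nothing usable.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Assumed fallback if st_blksize is unavailable (hypothetical value). */
#define FALLBACK_BLOCK_SIZE 4096

/* Round `wanted` up to the next whole multiple of `block`. */
static size_t round_up_to_block(size_t wanted, size_t block)
{
    assert(block > 0);
    return ((wanted + block - 1) / block) * block;
}

/*
 * Copy everything from fd_in to fd_out with a buffer of about `wanted`
 * bytes, rounded up to a multiple of the input's preferred I/O block size.
 * Returns 0 on success, -1 on error (errno is set by the failing call).
 */
static int copy_fd_blocks(int fd_in, int fd_out, size_t wanted)
{
    struct stat st;
    size_t block = FALLBACK_BLOCK_SIZE;

    if (fstat(fd_in, &st) == 0 && st.st_blksize > 0) {
        block = (size_t)st.st_blksize;
    }

    size_t bufsize = round_up_to_block(wanted, block);
    char *buf = malloc(bufsize);
    if (!buf) {
        return -1;
    }

    int result = 0;
    while (result == 0) {
        ssize_t n = read(fd_in, buf, bufsize);
        if (n == 0) {
            break; /* end of input */
        }
        if (n < 0) {
            if (errno == EINTR) {
                continue; /* interrupted, retry the read */
            }
            result = -1;
            break;
        }
        /* write() may be partial, so loop until the chunk is flushed. */
        ssize_t done = 0;
        while (done < n) {
            ssize_t w = write(fd_out, buf + done, (size_t)(n - done));
            if (w < 0) {
                if (errno == EINTR) {
                    continue;
                }
                result = -1;
                break;
            }
            done += w;
        }
    }

    free(buf);
    return result;
}
```

With 4096-byte blocks, asking for 10240 bytes gives a 12288-byte buffer and asking for 16384 leaves it at 16384. This only covers the "read+write" rows of the benchmark; the splice() variant is Linux-specific and is left out here.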