Re: [ccache] Stumbling blocks with ccache and embedded/encapsulated environments
On Wed, Dec 1, 2010 at 9:00 PM, Martin Pool m...@canonical.com wrote:
> On 11 November 2010 10:56, Christopher Tate ct...@google.com wrote:
>> I don't want to rain on peoples' parade here, because ccache is a great product that has real benefits, but I do want to share some of our findings regarding the use of ccache in our very large product -- we were surprised by them, and you may be as well. These findings are specifically for *large products*. In our case, the total source code file size is on the order of 3 gigabytes (which includes not only C/C++ but also Java source files, a couple hundred thousand lines of makefiles, etc). It's the Android mobile phone OS, fwiw: it builds something like 1-2 gigabytes of .o files from C/C++ during a full build, and does a ton of Java compilation, resource compilation, Dalvik compilation, etc as well.
>
> I'd love to know whether you also tried distcc for it, and if so what happened or what went wrong. (Obviously it can only help for the C/C++ phases.)

distcc can certainly help a great deal. For us, it's a bit problematic to use because more than half of our total build is non-C/C++ that depends on the C/C++ targets [e.g. Java-language modules that have partially native implementations], plus we have a highly heterogeneous set of build machines: both Mac hosts and Linux, not all the same distro of Linux, etc. The inclusion of Macs in particular makes distcc more of a pain to get up and running cleanly.

>> The issue is around VM/file system buffer cache management. If you're using ccache, then you'll effectively be doubling the number of .o files that are paged into memory during the course of a build.
>
> I'm just trying to understand how this happens. Is it that when ccache misses it writes out an object file both to the cache directory and into the build directory, and both will be in the buffer cache? So it's not so much they're paged in, but they are dirtied in memory and will still be held there.

Even on a ccache *hit* both copies of the .o file wind up occupying buffer cache space, because the ccached .o is read from disk [paging it in] in order to write the .o file to the build output directory. On a ccache miss the copy runs the other direction but you still wind up with both sets of pages in the buffer cache.

> It seems like turning on compression would reduce the effect.

At the expense of the extra cpu time, sure. That might be a decent tradeoff; modern cpus are getting quite fast relative to I/O.

> Turning on hardlinking might eliminate it altogether, though that could have other bad effects.

Right. We haven't tried pursuing this because for other reasons the marginal returns are still pretty low, and tinkering with the build system is fraught with peril. :)

--
christopher tate
android framework engineer

___
ccache mailing list
ccache@lists.samba.org
https://lists.samba.org/mailman/listinfo/ccache
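To make the copy-versus-hardlink point concrete, here is a minimal Python sketch. It is not ccache's actual code: `restore_from_cache`, `same_inode` and `demo` are hypothetical helpers written only for illustration. A plain copy leaves two distinct files (two inodes, so two sets of pages in the buffer cache), while a hard link leaves one inode reachable under two names.

```python
import os
import shutil
import tempfile


def restore_from_cache(cached_obj, output_obj, hardlink=False):
    """Place a cached object file at the build output path.

    Hypothetical helper for illustration; not ccache's real implementation.
    """
    if os.path.exists(output_obj):
        os.unlink(output_obj)
    if hardlink:
        # One inode, two directory entries: the data exists only once.
        os.link(cached_obj, output_obj)
    else:
        # Two inodes: the data is read from one file and written to another.
        shutil.copyfile(cached_obj, output_obj)


def same_inode(a, b):
    """Return True if both paths refer to the same underlying file."""
    sa, sb = os.stat(a), os.stat(b)
    return (sa.st_dev, sa.st_ino) == (sb.st_dev, sb.st_ino)


def demo():
    """Return (copy shares inode?, hardlink shares inode?)."""
    with tempfile.TemporaryDirectory() as tmp:
        cached = os.path.join(tmp, "cache_entry.o")
        out = os.path.join(tmp, "foo.o")
        with open(cached, "wb") as f:
            f.write(b"\0" * 4096)  # stand-in for real object code
        restore_from_cache(cached, out, hardlink=False)
        copied_shares = same_inode(cached, out)
        restore_from_cache(cached, out, hardlink=True)
        linked_shares = same_inode(cached, out)
        return copied_shares, linked_shares
```

One likely candidate for the "other bad effects" mentioned above: with a hard link, anything that modifies the output .o in place also modifies the cache entry, since they are the same file.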
Re: [ccache] Stumbling blocks with ccache and embedded/encapsulated environments
On Wed, 2010-12-01 at 21:47 -0500, Paul Smith wrote:
> Now I'm on to my next problem. In order to get this to happen I have to set CCACHE_BASEDIR to strip off the workspace directory prefix, so that the per-workspace filenames are not embedded in the cache. This works (see above), however the result is not so nice.

Ugh. I lied. Actually GDB handles this just fine with no special instruction; I had a problem on my test server (and then I misunderstood how the GDB substitute-path feature worked).

This works because GDB remembers not only the path of the source file, but also the working directory when the compile happened:

    (gdb) info source
    Current source file is ../../../src/subdir/foo/foo.c
    Compilation directory is /path/to/ONE/obj/subdir/foo
    Source language is c.
    Compiled with DWARF 2 debugging format.
    Does not include preprocessor macro info.

So GDB intelligently notices that the source file path is relative and appends it to the compilation directory, and viola! [1]

I _think_ I'm all set now. I'll check back if more issues surface.

Cheers!

-
[1] sic
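For anyone following along, the path resolution described above can be approximated in a few lines of Python. This is only a sketch of the idea, not GDB's real lookup logic (GDB also consults its source search path and any substitute-path rules); `resolve_source` is a hypothetical name.

```python
import posixpath


def resolve_source(comp_dir, source_name):
    """Resolve a recorded source file name against the compilation directory.

    Absolute names are returned unchanged; relative names are joined to
    comp_dir and normalised so the "../" components collapse.
    """
    if posixpath.isabs(source_name):
        return source_name
    return posixpath.normpath(posixpath.join(comp_dir, source_name))


# Values taken from the "info source" output in the message above.
resolved = resolve_source(
    "/path/to/ONE/obj/subdir/foo",
    "../../../src/subdir/foo/foo.c",
)
```

With the values from the session above, `resolved` comes out as `/path/to/ONE/src/subdir/foo/foo.c`.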
Re: [ccache] Stumbling blocks with ccache and embedded/encapsulated environments
>> Even on a ccache *hit* both copies of the .o file wind up occupying buffer cache space, because the ccached .o is read from disk [paging it in] in order to write the .o file to the build output directory. On a ccache miss the copy runs the other direction but you still wind up with both sets of pages in the buffer cache.
>
> In the hit case I would have thought that the .o file you read would still create less memory pressure than the working memory of running the real compiler on that file? Perhaps the difference is that the kernel knows that when the compiler exits, its anonymous pages can be thrown away, whereas it doesn't know which .o file it ought to retain. So perhaps madvise might help. (Just speculating.)

I'm curious about this. I guess you'd madvise to tell the kernel that the .o you just wrote shouldn't be cached? But presumably it should be, because you're going to link your program.

Alternatively, you could madvise and tell the kernel not to cache the .o file from ccache's cache. But if you re-compile, you want ccache's cache to be in memory.

I'm not sure how one might win here without hardlinking.

-Justin

On Thu, Dec 2, 2010 at 4:24 PM, Martin Pool m...@sourcefrog.net wrote:
> On 3 December 2010 03:42, Christopher Tate ct...@google.com wrote:
>>> I'd love to know whether you also tried distcc for it, and if so what happened or what went wrong. (Obviously it can only help for the C/C++ phases.)
>>
>> distcc can certainly help a great deal. For us, it's a bit problematic to use because more than half of our total build is non-C/C++ that depends on the C/C++ targets [e.g. Java-language modules that have partially native implementations],
>
> ... and you suspect that the Makefile dependencies are not solid enough to safely do a parallel build?
>
>> plus we have a highly heterogeneous set of build machines: both Mac hosts and Linux, not all the same distro of Linux, etc. The inclusion of Macs in particular makes distcc more of a pain to get up and running cleanly.
>
> That can certainly be a problem.
>
>>> I'm just trying to understand how this happens. Is it that when ccache misses it writes out an object file both to the cache directory and into the build directory, and both will be in the buffer cache? So it's not so much they're paged in, but they are dirtied in memory and will still be held there.
>>
>> Even on a ccache *hit* both copies of the .o file wind up occupying buffer cache space, because the ccached .o is read from disk [paging it in] in order to write the .o file to the build output directory. On a ccache miss the copy runs the other direction but you still wind up with both sets of pages in the buffer cache.
>
> In the hit case I would have thought that the .o file you read would still create less memory pressure than the working memory of running the real compiler on that file? Perhaps the difference is that the kernel knows that when the compiler exits, its anonymous pages can be thrown away, whereas it doesn't know which .o file it ought to retain. So perhaps madvise might help. (Just speculating.)
>
> --
> Martin
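As a rough illustration of the "tell the kernel not to cache the .o from ccache's cache" idea, here is a Python sketch. Note the swap: the thread talks about madvise, which applies to mapped memory; for files handled with plain read/write the comparable hint is posix_fadvise, which is what this uses. The function names are hypothetical, the hint is advisory only, and whether it helps a real build is untested speculation.

```python
import os
import shutil
import tempfile


def copy_then_drop_cache_pages(cached_obj, output_obj):
    """Copy a cached object to the build tree, then hint that the
    cache-side pages are no longer needed.

    Returns True if the hint was issued, False if unsupported here.
    Hypothetical helper; not something ccache is known to do.
    """
    shutil.copyfile(cached_obj, output_obj)
    if not hasattr(os, "posix_fadvise"):
        return False  # e.g. platforms without posix_fadvise
    fd = os.open(cached_obj, os.O_RDONLY)
    try:
        # offset=0, len=0 means "the whole file".
        os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_DONTNEED)
    except OSError:
        return False  # the kernel or filesystem rejected the hint
    finally:
        os.close(fd)
    return True


def demo():
    """Copy a small fake object file and confirm the contents match."""
    with tempfile.TemporaryDirectory() as tmp:
        cached = os.path.join(tmp, "cache_entry.o")
        out = os.path.join(tmp, "foo.o")
        payload = b"\x7fELF" + b"\0" * 1020  # stand-in for object code
        with open(cached, "wb") as f:
            f.write(payload)
        copy_then_drop_cache_pages(cached, out)
        with open(out, "rb") as f:
            return f.read() == payload
```

This hints on the cache-side file only, so the freshly written output stays available for the link step. It runs straight into the objection raised above, though: if the same file is compiled again soon, the cache entry has to come back from disk.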