[ccache] effect upon ccache of changes to cache_dir_levels
What does ccache do to an existing cache tree if one increases or decreases the value of cache_dir_levels in ccache.conf? Are cache entries moved? Are directories added or deleted to accommodate the change? Scott Bennett, Comm. ASMELG, CFIAG ** * Internet: bennett at sdf.org *xor* bennett at freeshell.org * ** * "A well regulated and disciplined militia, is at all times a good * * objection to the introduction of that bane of all free governments * * -- a standing army." * *-- Gov. John Hancock, New York Journal, 28 January 1790 * ** ___ ccache mailing list ccache@lists.samba.org https://lists.samba.org/mailman/listinfo/ccache
Re: [ccache] why is limit_multiple ignored?
Joel Rosdahl <j...@rosdahl.net> wrote:

> On 19 December 2017 at 02:16, Scott Bennett via ccache <ccache@lists.samba.org> wrote:
>

Hi Joel,
Sorry about the delay in responding. I've been off-line for about a week and a half and may be again shortly.

> > I set "limit_multiple = 0.95" in ccache.conf and "max_size = 30.0G"
> > in ccache.conf, but cleanups are triggered when space usage reaches 24 GB,
> > which is the default of 0.8. Why is this happening with ccache 3.3.4?
> >
>
> The ccache manual is not very good at describing what actually happens at
> cleanup. I'll try to improve it.
>
> Here's how cleanup works: After a cache miss, ccache stores the object file
> in (a subdirectory of) one of the 16 top level directories in the cache
> (0-9, a-f). It then checks if that top level directory holds more than
> max_cache_size/16 bytes (and similar for max_files). If yes, ccache removes
> files from that top level directory until it contains at most
> limit_multiple*max_cache_size/16 bytes. This means that if limit_multiple

The design problem is that there is no centralized index maintained of cache entries' paths, their sizes, and their timestamps, necessitating the plumbing of the directory trees. This very time-consuming task should only be required when a ccache user determines that the cache is internally inconsistent somehow, e.g., by having one or more damaged entries, having erroneous statistics, or by being out of step with the index. It should not be part of an ordinary cache eviction procedure. A command to run a consistency check/repair should not do any cache evictions based upon space, which would be done by the next actual use of ccache anyway, but rather only if the files involved are part(s) of a damaged cache entry.

The overhead of maintaining the index should be minor, especially when compared to the current cleanups that can take over a half hour to run and hammer a hard drive mercilessly. (A centralized index should also include the total space in use.) The lack of a centralized index can also result in cache evictions that are not actually LRU. The kludge of using 16 caches instead of a single, unified cache would be unnecessary with a centralized index as well. The index would be used to go directly to each file to be deleted without the need for a directory tree search. Cleanups ought to be much faster. Note that some sort of short-term lock would need to be used for updating the index, too, but the same is already true for the $CCACHE_DIR/[0-9a-f]/stats files.

> is 0.8, the total cache size is expected to hover around 0.9*max_cache_size
> when it has filled up. But due to the pseudo-randomness of the hash

Where does the hysteresis of (0.9-0.8)max_size=0.1*max_size come from?

> algorithm, the cache size can be closer to 0.8*max_cache_size or
> 1.0*max_cache_size.
>
> The above should be true for any serial usage of ccache. However, ccache is
> of course very often called in parallel, and then there is a race condition
> since several ccache processes that have stored an object to the same top
> level directory may start the cleanup process simultaneously. Since
> performing cleanup in a large cache with a low limit_multiple can take a
> lot of time, more ccache processes may start to perform cleanup of the same
> directory. The race can lead to the final cache size being below
> limit_multiple*max_cache_size, perhaps very much so. This is a known
> problem. We have had some ideas to improve the admittedly naive cleanup
> logic, but nothing has been done yet.

That problem, at least, seems relatively straightforward to fix. First, only one cleanup need be done in such situations, so a lock should be tested and set by the first ccache process that decides a cleanup is necessary. All later comers should be delayed until that cleanup completes, but then those others should proceed without also doing cleanups. Their decisions in favor of a cleanup are out of date once the cleanup run completes, so they should just skip any cleanups themselves or at least retest the size of what they need to store plus the current cache size against max_size to make a fresh decision.

>
> Maybe the above described problem is why you get a 24 GB cache size?

See discussion below.

>
> Or maybe you ran "ccache -c"? Unlike what the manual indicates, "ccache -c"

No, it was automatically triggered.

> will delete files until each top level directory holds at most
> limit_multiple*max_size/16...
>
> > why is limit_multiple ignored?
> >
>
> It isn't. Or don't you see a difference if you e.g. set it to 0.5?
>

I haven't tried that. The caches I have represent a lot of CPU time and elapsed time, especially given that I have compression turned on, so I
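For illustration only, here is a minimal Python sketch of the two ideas raised in the reply above: a centralized index that tracks each entry's size, its recency, and the running total, and a cleanup that re-tests the cache size once it holds the lock. This is not how ccache works. The class CacheIndex, its methods, and the in-process threading.Lock (standing in for a short-term file lock) are all hypothetical names chosen for the sketch.

```python
import threading
from collections import OrderedDict


class CacheIndex:
    """Hypothetical centralized index: path -> size, ordered by last use."""

    def __init__(self, max_size, limit_multiple):
        self.max_size = max_size
        self.limit_multiple = limit_multiple
        self.entries = OrderedDict()   # least recently used entry comes first
        self.total_size = 0            # running total, no directory walk needed
        self.lock = threading.Lock()   # stand-in for a short-term file lock

    def touch(self, path, size):
        """Record a store or a hit and mark the entry most recently used."""
        with self.lock:
            old = self.entries.pop(path, None)
            if old is not None:
                self.total_size -= old
            self.entries[path] = size
            self.total_size += size

    def cleanup_if_needed(self):
        """Evict in LRU order, re-testing the size once the lock is held."""
        evicted = []
        with self.lock:
            # A caller that waited for the lock may find that an earlier
            # cleanup already made room, in which case it skips its own.
            if self.total_size <= self.max_size:
                return evicted
            target = self.limit_multiple * self.max_size
            while self.entries and self.total_size > target:
                path, size = self.entries.popitem(last=False)
                self.total_size -= size
                evicted.append(path)   # a real cache would unlink the file here
        return evicted


index = CacheIndex(max_size=100, limit_multiple=0.8)
for n in range(11):
    index.touch(f"obj{n}", 10)       # 11 entries of 10 units each
index.touch("obj0", 10)              # obj0 becomes the most recently used
first = index.cleanup_if_needed()    # over max_size, so evict oldest entries
second = index.cleanup_if_needed()   # stale request: the re-test finds room
print(first, second, index.total_size)
```

The second call stands for a process whose decision to clean up went stale while it waited: the re-test under the lock finds the cache already below max_size and does nothing.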
[ccache] why is limit_multiple ignored?
I set "limit_multiple = 0.95" in ccache.conf and "max_size = 30.0G" in ccache.conf, but cleanups are triggered when space usage reaches 24 GB, which is the default of 0.8. Why is this happening with ccache 3.3.4?

Scott Bennett, Comm. ASMELG, CFIAG
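To put numbers on the question, the following Python sketch applies the per-directory rule described in Joel's reply elsewhere in this thread (cleanup of a top-level directory starts above max_size/16 and stops at limit_multiple*max_size/16) to the values in the question. The function name and the "midpoint" estimate of where the total hovers are assumptions made for this sketch, not something taken from the ccache source.

```python
TOP_LEVEL_DIRS = 16  # the 0-9, a-f directories named in the reply


def cleanup_thresholds(max_size_gb, limit_multiple):
    """Return per-directory and whole-cache figures, all in GB."""
    trigger = max_size_gb / TOP_LEVEL_DIRS   # a directory above this is cleaned
    target = limit_multiple * trigger        # cleaning stops at or below this
    return {
        "dir_trigger": trigger,
        "dir_target": target,
        # Total if every directory had just been trimmed to its target.
        "total_floor": target * TOP_LEVEL_DIRS,
        # Assumed estimate: directories spread evenly between target and trigger.
        "total_midpoint": (trigger + target) / 2 * TOP_LEVEL_DIRS,
    }


for limit_multiple in (0.8, 0.95):
    figures = cleanup_thresholds(30.0, limit_multiple)
    summary = ", ".join(f"{name}={value:.3f}" for name, value in figures.items())
    print(f"limit_multiple={limit_multiple}: {summary}")
```

Under these assumptions a 30 GB cache with limit_multiple = 0.8 bottoms out at 24 GB, while 0.95 would bottom out at 28.5 GB, so the two settings should be distinguishable from the total size reported by the statistics.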
Re: [ccache] useful test package to use with ccache at the outset in Gentoo.....
Michael Fothergill via ccache <ccache@lists.samba.org> wrote:

> On 8 December 2017 at 01:35, Scott Bennett via ccache <ccache@lists.samba.org> wrote:
>
> > Michael Fothergill via ccache <ccache@lists.samba.org> wrote:
> >
> > > I have an amd64 kaveri box with 8GB RAM and run Gentoo stable on it.
> > >
> > > I have just installed ccache with 2GB memory allocated to it.
> >
> > By that, I assume you have allocated some kind of memory-based device
> > for the cache. Is that a correct understanding?
> >
>
> Thanks for your reply and comments. I am assuming that by having the
> standard command
>
> CCACHE_SIZE="4G" (I have increased the allocation)

As ccache is installed on my system, I cannot find an environment variable of that name documented. Are you sure it is being used? What I do find documented are CCACHE_MAXSIZE and the corresponding ccache.conf parameter max_size.

If you're going to build large items like libreoffice or a Linux world, you probably ought to have a much larger cache. I also found it a good idea to change limit_multiple from its default value of 0.8 to 0.95 to avoid half-hour-long cleanups in the middle of build runs.

It is worth noting that, by my understanding of ccache the last time I dug into it a bit, ccache actually maintains 16 distinct caches, but the max_size and limit_multiple values apply to the total size and usage fraction of the aggregate of all 16 caches. When a cleanup occurs, ccache chooses one of the 16 caches and begins deleting the least recently used entries in it until the total space allocated has been reduced to the limit_multiple fraction of the max_size. If it runs out of things to delete in that cache while the total allocated remains above that fraction, ccache chooses another cache from which to begin deleting entries, and so on. This procedure differs from the one described in the ccache man page and is one reason I like to give more space to the cache(s) in order to prevent recent entries from disappearing from a cache while other, far less recently used entries remain in the other 15 caches. By having a max_size large enough to hold the last several iterations of frequently built items, cleanups are more likely to satisfy the limit_multiple by deleting the oldest few iterations of updates while sparing the more recently used entries in a cache. These days the disk space is cheap enough to give it 10 GB to 30 GB without creating any problems for me, so I just do that.

Another trick to keep the caches useful is to allocate separate ones for different purposes. For example, I set up one for building the OS userland and kernel, another for building libreoffice, and a third for building everything else. Doing this keeps the OS and libreoffice from evicting everything else or each other prematurely. :-)

> then memory from the hard drive is being used by default here - I was not
> trying to use e.g. RAM memory.
>

Oh. Okay. If your system is used heavily for compiling software, you may see some performance gains from putting the cache area onto an SSD. Using a system memory-based cache is, of course, lightning fast, but the entire cache evaporates when the device is deallocated (e.g., during a system shutdown or failure).

I tried using software five- and six-way RAID-0 devices for the file system containing my caches for a while, but decided their performance was poor. At present I'm using a software CONCAT made of two two-way software RAID-1's, all on the same kind of hard drives as the earlier setups, and this setup seems to do very nicely for now. I just have to remember not to run updates at the same time as scrubs on the six-way raidz2 that occupies the bulk of the same drives. :-) I only scrub that pool about every three to four weeks, though, so it usually isn't a problem.
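As a rough illustration of the separate-caches trick, the Python sketch below builds a per-purpose environment for a build command. The directory paths, the sizes, and the helper name ccache_env are hypothetical. CCACHE_DIR and CCACHE_MAXSIZE are, to my understanding of the ccache manual, the environment variables for the cache location and max_size, but check the manual for your version. limit_multiple would still be set in each cache's own ccache.conf.

```python
import os

# Hypothetical cache locations and sizes, one per build purpose.
CACHES = {
    "os": ("/var/cache/ccache-os", "30G"),
    "libreoffice": ("/var/cache/ccache-libreoffice", "30G"),
    "other": ("/var/cache/ccache-other", "10G"),
}


def ccache_env(purpose, base_env=None):
    """Return a copy of the environment pointing ccache at one cache."""
    cache_dir, max_size = CACHES.get(purpose, CACHES["other"])
    env = dict(os.environ if base_env is None else base_env)
    env["CCACHE_DIR"] = cache_dir        # which cache tree ccache uses
    env["CCACHE_MAXSIZE"] = max_size     # assumed env counterpart of max_size
    return env


env = ccache_env("libreoffice", base_env={"PATH": "/usr/bin"})
print(env["CCACHE_DIR"], env["CCACHE_MAXSIZE"])
```

A build wrapper could then pass the result to something like subprocess.run(["make"], env=ccache_env("os")), so that an OS build never evicts entries belonging to the other caches.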
> >
> > > I have tried some repeat compilations to see if there would be any speed
> > > increase.
> > >
> > > So far I have not seen much change but I am not skilled enough to improve
> > > things yet.
> >
> > Your statistics show that slightly more than 45% of your total
> > compiler invocations (hits/(hits+misses)) were avoided. Did that not
> > make a dent in your timings?
> >
> > > I tried compiling gcc, glibc and imagemagick but did not see much
> > > improvement.
> >
> > If you run the full build process for gcc, I would not expect
> > to see much improvement because most of it involves the use of either
> > a) a temporarily built compiler in a temporary location or b) the
> > newly built compiler being used for testing, but not yet installed
> > into the production location on your system.
> >
>
> Would cachecc1 perform any better with gcc?
Re: [ccache] useful test package to use with ccache at the outset in Gentoo.....
Michael Fothergill via ccache <ccache@lists.samba.org> wrote:

> I have an amd64 kaveri box with 8GB RAM and run Gentoo stable on it.
>
> I have just installed ccache with 2GB memory allocated to it.

By that, I assume you have allocated some kind of memory-based device for the cache. Is that a correct understanding?

>
> I have tried some repeat compilations to see if there would be any speed
> increase.
>
> So far I have not seen much change but I am not skilled enough to improve
> things yet.

Your statistics show that slightly more than 45% of your total compiler invocations (hits/(hits+misses)) were avoided. Did that not make a dent in your timings?

>
> I tried compiling gcc, glibc and imagemagick but did not see much
> improvement.

If you run the full build process for gcc, I would not expect to see much improvement because most of it involves the use of either a) a temporarily built compiler in a temporary location or b) the newly built compiler being used for testing, but not yet installed into the production location on your system.

ImageMagick and GraphicsMagick both should provide useful timings and ccache statistics. glibc probably would, too, though it's not nearly as big. I don't know what sort of build procedures Gentoo uses, but from the FreeBSD ports tree, here are some other good examples of test cases: math/octave, www/webkit-gtk2, www/webkit-gtk3, www/webkit2-gtk3, devel/llvm40. Be prepared to wait a long time for the first compilation of each of the webkits. They are big and slow to compile and, in the past, have shown instabilities in their build procedures when parallel make runs were used. YMMV on another OS.

One big savings for me was in running "make buildworld" and "make buildkernel". buildworld, on my last machine, was taking about six hours elapsed time for a first run. When running it later after updating the source tree, the elapsed time was reduced by 2/3 to 3/4, depending upon the number and sizes of source modules affected by the updates.
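The two figures quoted above are simple ratios, shown here as a small Python sketch. The hit and miss counts are made up for the example (the actual statistics are not reproduced in this message), and both function names are hypothetical.

```python
def hit_rate(hits, misses):
    """Fraction of compiler invocations avoided: hits / (hits + misses)."""
    total = hits + misses
    return hits / total if total else 0.0


def rebuild_hours(first_run_hours, reduction):
    """Elapsed time left after cutting a first-run time by `reduction`."""
    return first_run_hours * (1 - reduction)


# Hypothetical counts giving a rate slightly above 45%.
rate = hit_rate(hits=460, misses=540)

# A six-hour first buildworld, reduced by 2/3 to 3/4 on later runs.
slowest = rebuild_hours(6, 2 / 3)
fastest = rebuild_hours(6, 3 / 4)

print(f"hit rate: {rate:.1%}")
print(f"later buildworld runs: about {fastest:.1f} to {slowest:.1f} hours")
```

On those numbers, a six-hour first run would come down to roughly an hour and a half to two hours once the cache is warm.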
Note that ccache and some other things need a slightly different setup in order to build FreeBSD. Your OS may also need some special provision, so be sure to read the ccache installation instructions for Gentoo carefully.

Scott Bennett, Comm. ASMELG, CFIAG