Re: [ccache] Long pauses due to automatic cache clean-up

2021-04-27 Thread Joel Rosdahl via ccache
On Sun, 18 Apr 2021 at 10:04, R. Diez via ccache wrote:
> First of all, I have been using ccache for years. Many thanks for this great 
> tool.

You're welcome.

> I have come to the conclusion that these particular pauses are due to ccache
> automatically pruning its cache.

I haven't really observed pauses due to cleanup myself since I have used SSDs
for a long time, but yes, they will occur with a slow enough disk (or loaded
system) and many parallel ccache invocations.

> I was surprised that completely clearing the cache with option '--clear' can
> take a very long time, 12 minutes the last time around, because ccache
> probably does not need to calculate anything beforehand, just delete all
> files. But I haven't benchmarked the filesystem itself, so perhaps it's just
> Linux being slow deleting so many small files.

Yes, what you're seeing is a slow disk (or filesystem) in combination with many
files in the cache directory. You will get similar performance from "rm -r".
Upgrading to ccache 4.x will actually help since then only one file will be
stored per cache entry instead of up to seven files (but more likely two) with
ccache 3.x, so there will be fewer files to consider and remove when cleaning or
clearing.

> What happens if you are building in parallel, with "make -j32", and all 32
> concurrent instances decide to clean the cache at the same time? Will all of
> them attempt to delete the same files at the same time?

Yes, that could happen. A ccache invocation that writes the result of a
compilation to one of the 16 cache buckets will start cleaning up that bucket if
the cache size threshold has been reached. If another parallel compilation
finishes and writes to the same bucket before the first has finished cleaning
then it too will start cleaning and so on. There is no coordination between such
cleanups.
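
To give a rough sense of how likely such overlaps are, here is a minimal
sketch (not ccache code). It assumes that results land uniformly at random in
the buckets, and it only estimates how often concurrent writes share a bucket;
a shared bucket leads to overlapping cleanups only when that bucket is also over
its threshold. The function name is made up for illustration.

```python
from math import prod


def shared_bucket_probability(jobs: int, buckets: int = 16) -> float:
    """Probability that at least two of `jobs` concurrent writes share a bucket.

    Assumes each write picks a bucket uniformly at random (an assumption,
    not something taken from the ccache source).
    """
    if jobs > buckets:
        # Pigeonhole principle: more writers than buckets forces a collision.
        return 1.0
    # Probability that all writes land in distinct buckets (birthday problem).
    all_distinct = prod((buckets - i) / buckets for i in range(jobs))
    return 1.0 - all_distinct


for jobs in (2, 4, 8, 32):
    print(f"{jobs:2d} jobs, 16 buckets: {shared_bucket_probability(jobs):.3f}")
```

With 32 parallel jobs and 16 buckets the result is 1.0 regardless of how the
results are distributed, since there are more writers than buckets.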

This is how the automatic cleanup has worked since ccache was initially created
in 2002. I have ideas on how this can be improved but I don't have much time and
interest to work on ccache these days so things are unfortunately slow. (I no
longer use ccache personally except for building ccache...)

> I wonder if there is a way to increase the number of buckets, from 16 to,
> say, 1024. There is no reason to have so few of them.

There is a reason, and it's spelled "backward compatibility". Other minor
reasons also exist.

> I have also been thinking of triggering a manual cache clean-up on start-up
> [...]
> Is that a good strategy? Or are there better ways?

I think it's a good strategy given that you have a problem with automatic
cleanups. You could consider disabling automatic cleanup completely (set
max_size to 0) and run "CCACHE_MAXSIZE=20G ccache -c" or so each night or more
often. The limit will then only affect that ccache invocation; it won't be
stored in the config file. Also, you don't have to parse the ccache
configuration since the cron job has the definition of what the max cache size
should be.
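
A minimal sketch of such a scheduled cleanup, for example as a script run from
cron, might look like the following. It relies only on what is described above:
the CCACHE_MAXSIZE environment variable and "ccache -c". The function names and
the 20G default are illustrative.

```python
import os
import subprocess


def cleanup_invocation(max_size: str = "20G"):
    """Return (argv, env) for a one-off cleanup with a temporary size limit.

    The limit is passed through the CCACHE_MAXSIZE environment variable, so it
    applies to this invocation only and is not written to the config file.
    """
    env = dict(os.environ)  # Copy, so the caller's environment is untouched.
    env["CCACHE_MAXSIZE"] = max_size
    return ["ccache", "-c"], env


def run_cleanup(max_size: str = "20G") -> None:
    """Run the cleanup; raises CalledProcessError if ccache exits non-zero."""
    argv, env = cleanup_invocation(max_size)
    subprocess.run(argv, env=env, check=True)
```

Calling run_cleanup("18G") from the scheduled script would then trim the cache
to a little below the usual 20G, leaving headroom for the day's builds.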

> Is there some script able to parse the ccache config file that I could use as
> an example?

Not that I know of. You can query the size with "ccache -k max_size", but it
will include a size suffix that you need to parse if you want to make it
generic.
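
As a starting point, a minimal sketch of such parsing could look like this. It
assumes that the suffixes k, M, G and T are decimal, that Ki, Mi, Gi and Ti are
binary, and that a bare number means gigabytes; check this against the manual of
the installed ccache version, since the exact output format of
"ccache -k max_size" may differ between releases. All function names are
hypothetical.

```python
import re
import subprocess

# Suffix multipliers (assumed, see the note above). A missing suffix is
# treated as gigabytes.
_MULTIPLIERS = {
    "": 10**9,
    "k": 10**3, "M": 10**6, "G": 10**9, "T": 10**12,
    "Ki": 2**10, "Mi": 2**20, "Gi": 2**30, "Ti": 2**40,
}

_SIZE_RE = re.compile(r"\s*(\d+(?:\.\d+)?)\s*(Ki|Mi|Gi|Ti|k|M|G|T)?\s*")


def parse_size(text: str) -> int:
    """Convert a size string such as '20.0G' or '512Mi' to bytes."""
    match = _SIZE_RE.fullmatch(text)
    if match is None:
        raise ValueError(f"unrecognized size: {text!r}")
    number, suffix = match.group(1), match.group(2) or ""
    return int(float(number) * _MULTIPLIERS[suffix])


def query_max_size(ccache: str = "ccache") -> int:
    """Ask ccache for max_size and return it in bytes."""
    result = subprocess.run([ccache, "-k", "max_size"], check=True,
                            capture_output=True, text=True)
    return parse_size(result.stdout)


def scaled_limit(size_bytes: int, percent: int = 90) -> str:
    """Return `percent` of the size as whole megabytes with an 'M' suffix."""
    return f"{size_bytes * percent // 100 // 10**6}M"
```

For a configured max_size of 20G, scaled_limit(query_max_size()) would yield a
value that can be passed as CCACHE_MAXSIZE to a manual cleanup. The explicit
"M" suffix matters because a bare number would be read as gigabytes under the
assumption above.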

-- Joel

___
ccache mailing list
ccache@lists.samba.org
https://lists.samba.org/mailman/listinfo/ccache


[ccache] Long pauses due to automatic cache clean-up

2021-04-18 Thread R. Diez via ccache

Hi all:

First of all, I have been using ccache for years. Many thanks for this great 
tool.

I don't use ccache for overnight builds, because it does not really matter if
the builds take longer then. I am using ccache when (re-)building interactively.

Normally, ccache is great: many rebuilds are instant. But sometimes, the build
pauses for a long time, say 20 seconds. There is no CPU usage during this time.

I have had slowdowns because of unrelated local network problems, but in the
meantime, I have come to the conclusion that these particular pauses are due to
ccache automatically pruning its cache.

It would be nice if someone could confirm that they have seen such pauses on
their computers.

My cache is not particularly big, but I am using a conventional (rotational) 
desktop hard disk. There are no syslog messages about any disk issues.

The ccache configuration is:
  max_size = 20G
  max_files = 10

I am using Ubuntu 20.04.2 on a standard ext4 filesystem. I only added flags
"noatime,commit=30" to this filesystem in /etc/fstab.

The ccache version on Ubuntu is probably rather old: 3.7.7, whereas the newest
seems to be 4.x (4.2.1, for example).

I was surprised that completely clearing the cache with option '--clear' can
take a very long time, 12 minutes the last time around, because ccache probably
does not need to calculate anything beforehand, just delete all files. But I
haven't benchmarked the filesystem itself, so perhaps it's just Linux being slow
deleting so many small files.

Apparently, I am not the first one to talk about low performance when cleaning 
the cache. For example, I found this statement in the mailing list:

"the current cleanups that can take over a half hour to run and hammer a hard drive 
mercilessly"
https://www.mail-archive.com/ccache@lists.samba.org/msg01411.html

It is not clear from the documentation how ccache manages automatic cache
cleaning/pruning during a parallel build. What happens if you are building in
parallel, with "make -j32", and all 32 concurrent instances decide to clean the
cache at the same time? Will all of them attempt to delete the same files at the
same time? I have seen that there are 16 buckets, but there are bound to be some
collisions with a sufficiently large parallel factor.

I wonder if there is a way to increase the number of buckets, from 16 to, say,
1024. There is no reason to have so few of them.

I have also been thinking of triggering a manual cache clean-up on start-up,
after logging on, while I try to wake up and find the coffee machine. I could
set the clean targets a little lower than usual, say 10 % less than the usual
max_size and max_files configuration settings, so that there is enough space
left for the day. This way, no more automatic clean-up will probably happen
during the day. I could always set up a cron job that runs more frequently. The
idea is to prevent a cache pruning pause during interactive development.

Is that a good strategy? Or are there better ways?

The limit_multiple option is not taken into account for manual cleanup, and
lowering it would probably not trigger a cleanup anyway, so I would have to
parse the config file to find out the current values of max_size and max_files,
reduce them by some factor like 10 %, and then run "ccache --cleanup" with those
values, if that is possible.

I am worried that passing --max-size modifies the configuration file, so I may
have to revert the configuration file afterwards, and hope that no normal
compilations start during this time, depending on how long the coffee machine
needs to warm up. Or maybe I could use a second configuration file for the same
cache with CCACHE_CONFIGPATH, or some other such hack.

Is there some script able to parse the ccache config file that I could use as 
an example?

Thanks in advance,
  rdiez
