Re: [ccache] Combining multiple ccache into one

2018-03-18 Thread Anders Björklund via ccache
On 2018-03-14 at 19:31, Basile Starynkevitch via ccache wrote:
> 
> 
> On 03/14/2018 06:54 PM, Jason Zhou via ccache wrote:
>> Hi,
>>
>> I am looking for an efficient way to correctly combine multiple ccache from 
>> hundreds of build machines into a single ccache to build a super set ccache.
> 
> perhaps you should consider distcc https://github.com/distcc/distcc or 
> icecream https://github.com/icecc/icecream
> 

The use of distcc is orthogonal to the use of ccache,
it only applies in the case of a cache miss (compile)

/Anders
___
ccache mailing list
ccache@lists.samba.org
https://lists.samba.org/mailman/listinfo/ccache


Re: [ccache] Combining multiple ccache into one

2018-03-18 Thread Anders Björklund via ccache
Jason Zhou wrote:
> I am looking for an efficient way to correctly combine multiple
> ccache from hundreds of build machines into a single ccache to build
> a super set ccache. We use 200+ autoscaled cloud machines in our
> build farm and each machine builds a random subsets of the source
> tree. ccache size on each machine is ~70GB and contains ~500K files.
> Having a superset ccache pre-built in the cloud image will greatly
> improve our build time.

Including a pre-populated cache in the OS image is a novel idea,
but I wonder if you would have to resort to that "workaround" ?

You could keep a local cache, and sync it from a "secondary cache".
We have some code for this, but none of it is up to date with master.

> I noticed the same ccache filename (*.o, *.manifest, *.d) not
> necessarily has the same content (md5sum) on different machines and
> wonder if rsync is the right tool to do this, or is it feasible at
> all to combine ccache.

This is normal. The created files might have different timestamps
and such, that makes their checksum different. But they are supposed
to be interchangeable, so none of those differences should *matter*
(if they do, then we are failing to hash something important...)
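
If entries with the same name really are interchangeable, a merge only has to copy the names that the destination lacks. A minimal sketch, assuming the plain on-disk file layout; `merge_cache` and the `SKIP_NAMES` list are hypothetical, and skipping `stats` and `ccache.conf` is an assumption about which files are per-cache bookkeeping:

```python
import os
import shutil

# Assumed to be per-cache bookkeeping rather than cached results.
SKIP_NAMES = {"stats", "ccache.conf"}


def merge_cache(src_root, dst_root):
    """Copy entries from src_root that dst_root lacks; return count copied."""
    copied = 0
    for dirpath, _dirnames, filenames in os.walk(src_root):
        rel_dir = os.path.relpath(dirpath, src_root)
        for name in filenames:
            if name in SKIP_NAMES:
                continue
            dst_dir = os.path.join(dst_root, rel_dir)
            dst = os.path.join(dst_dir, name)
            if os.path.exists(dst):
                # Same name means same hash key: keep the existing entry.
                continue
            os.makedirs(dst_dir, exist_ok=True)
            shutil.copy2(os.path.join(dirpath, name), dst)
            copied += 1
    return copied
```

The destination's size bookkeeping would likely be stale after such a merge, so it would need to be recalculated before the cache is used.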

> I am trying to avoid ccache on NFS mount due to number of machines we
> are dealing with and performance of NFS is not promising. 

Have you tried out the memcached version ? It was developed for
that reason... You can have a cluster of such servers, if needed.

https://github.com/ccache/ccache/tree/dev/memcached

To further scale out, one can keep a local memcached proxy ("moxi")
and have the cluster be disk-backed (using couchbase) for restarts.

https://www.couchbase.com/memcached

/Anders


Re: [ccache] ccache - .d files with absolute system header paths

2017-09-16 Thread Anders Björklund via ccache
Karanam Suryanarayana Rao Venkata Krishna wrote:
> Hello,
> I think I discovered a scenario that results in cache misses in spite of
> using CCACHE_BASEDIR.
> Consider the following command:
> 
> CCACHE_BASEDIR=$PWD /bin/bash -c "ccache clang++
> -fdebug-prefix-map=/proc/self/cwd= -g -c -MD -MF hello.d -o hello.o
> hello.cpp"
> 
> It seems to me that it is perfectly alright to ask for debug prefix mapping
> like: "-fdebug-prefix-map=/proc/self/cwd="
> Unfortunately, ccache is ending up hashing gnu_getcwd(); thus, even though
> we use CCACHE_BASEDIR setting, such a cache cannot be shared by other
> users' from different workspaces resulting in cache misses.
...
> 
> If the string after "=" in the mapping is null string, then, I hash "./".

Can't you just use "$PWD" and ".", instead of this elaborate scheme ?

Especially since using /proc/self/cwd doesn't even work, not with GCC...
i.e. when you give that prefix, it will just look for that path string

In the actual debug info, you will _still_ have a reference to the cwd.
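
The "$PWD" and "." suggestion amounts to passing the real working directory as the old prefix and "." as the new one. A small sketch of building such a command line; the helper names are hypothetical, and it says nothing about what ccache itself ends up hashing:

```python
import os


def debug_prefix_map_flag(cwd=None):
    """Build -fdebug-prefix-map=<cwd>=. for the given (or current) directory."""
    cwd = os.getcwd() if cwd is None else cwd
    return "-fdebug-prefix-map={}=.".format(cwd)


def compile_command(source, cwd=None):
    """Return an argv list for compiling one source file through ccache."""
    obj = os.path.splitext(source)[0] + ".o"
    return ["ccache", "clang++", debug_prefix_map_flag(cwd),
            "-g", "-c", source, "-o", obj]
```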

/Anders


Re: [ccache] base_dir and symbolic links

2017-09-16 Thread Anders Björklund via ccache
Andreas Wettstein wrote:
> In function `make_relative_path` (file `ccache.c`), the given
> path is made "canonic" before it is converted to a path relative
> to the current working directory.  In particular, "canonic" means
> that symbolic links are removed.  I understand that it makes
> sense to make the current working directory canonic, but I do not
> see why this is done for the path given to `make_relative_path`.  Is
> it really necessary?
> 
> The reason why I am asking is that I have a usage scenario where
> this reduces sharing a cache among different users.

Seems like a pretty niche use case, but if you can come up with a
patch that doesn't break too much of current behaviour - why not.

You should still get preprocessor hits, but I suppose you would
rather get "direct" hits instead of having to read both files.

Worst case, it could be an option (that defaulted to false) ?

Then you could override it at runtime, for your use case...
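
To illustrate the difference being discussed: resolving symbolic links in the given path changes what the relative path looks like. This is only a sketch using Python's path functions, not the code in `make_relative_path`; the directory names are made up for the demo:

```python
import os
import tempfile


def relative_paths(path, cwd):
    """Return (lexical, canonical) forms of path relative to cwd."""
    canonical_cwd = os.path.realpath(cwd)
    # Lexical: only normalise "." and "..", keep symlinks as written.
    lexical = os.path.relpath(os.path.normpath(path), canonical_cwd)
    # Canonical: resolve symlinks in the path before making it relative.
    canonical = os.path.relpath(os.path.realpath(path), canonical_cwd)
    return lexical, canonical


def demo():
    root = os.path.realpath(tempfile.mkdtemp())
    os.makedirs(os.path.join(root, "shared", "include"))
    os.makedirs(os.path.join(root, "work"))
    # work/inc is a symlink into a directory shared between users.
    os.symlink(os.path.join(root, "shared", "include"),
               os.path.join(root, "work", "inc"))
    header = os.path.join(root, "work", "inc", "foo.h")
    return relative_paths(header, os.path.join(root, "work"))
```

Here the lexical form stays `inc/foo.h` while the canonical form points out of the working directory, to `../shared/include/foo.h`.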

/Anders


Re: [ccache] Swift Support?

2017-05-28 Thread Anders Björklund via ccache
Jimmy Yue wrote:

> I was wondering if anybody in this list has tried to support Swift and maybe
> even ibtool. I'm looking to implement this myself if there's a gap here,
> but if anybody has tried it before and ran into problems I'd love to know.

I think the only thing Swift has in common with C++ are the slow compile
times, or else I'm missing something. Interface Builder is even further.

But maybe you mean that it could use a similar (memoization) approach ?

We had a similar question before, but I think that it went unanswered...
https://lists.samba.org/archive/ccache/2016q3/001465.html

I think it *could* probably be done, but it would look a lot different.


The first step would be to get a list of all the file dependencies.
There is no preprocessor, so no need to worry about (just modules)

Apparently there is a "swiftc -emit-dependencies" (output depends),
maybe the .d files could be parsed to generate something useful ?
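
Parsing such a file could look roughly like this, assuming the output uses make-style rules ("target : dep dep", with backslash continuations). That format is an assumption here, and paths containing spaces or colons are not handled:

```python
def parse_depfile(text):
    """Parse make-style dependency rules into {target: [dependencies]}."""
    rules = {}
    # Join backslash-continued lines before splitting into rules.
    joined = text.replace("\\\n", " ")
    for line in joined.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        target, sep, deps = line.partition(":")
        if not sep:
            continue  # not a rule line
        rules.setdefault(target.strip(), []).extend(deps.split())
    return rules
```

The resulting dependency list is what a cache would have to hash in place of preprocessor output.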

Might be some similarities to https://github.com/ccache/ccache/pull/130


And it would probably make for a separate program/project, as well...

Even the Objective-C stuff (i.e. modules) is getting hard to support.

/Anders


Re: [ccache] Theoretical question regarding ccache

2017-04-29 Thread Anders Björklund via ccache
Aashaka Shah wrote:

> I recently came across ccache as I have an assignment to think of a novel
> compiler design or related problem, and having a cache for compilers was
> the first thing that came to my mind. I thought of trying it out on
> QuantLib, a financial computation library.

Hi! It would be interesting to have some more scientific research on this,
since ccache is a rather pragmatic project. It's "memoization", I think.

> I would like to know
> 
> 1. Why multiple source files cannot take advantage of ccache ( Why other
> types of compilations (multi-file compilation, linking, etc) will silently
> fall back to running the real compiler)

Well, depends on what you mean with "source" exactly. Basically when you
have one source (.c), it would still read a lot of headers (.h) etc...

But when you give the compiler more than one C/C++ source file, it will
compile them one by one anyway. Looks like your build system is broken ?


Do you have an example of the actual files you are trying to compile ?

It would help when trying to understand what you are trying to do here.


> 2. Where in the memory hierarchy(cache, main memory?) does the ccache
> output reside? Does it work according to the default replacement policies
> of the hardware cache?

The ccache files are stored on the filesystem, so in the "secondary" ?
Usually it's stored in RAM and written to disk, unless using a ramdisk.

Using the memcached implementation also offers you other mixed options.
https://en.wikipedia.org/wiki/Memcached (see the dev/memcached branch)

/Anders


[ccache] Testing on other platforms (than Linux)

2017-03-30 Thread Anders Björklund via ccache
Hi ccache!

We have earlier had some problems with testing on other *Unix* platforms,
like Solaris or FreeBSD. Mostly because none of us are running those...

https://github.com/ccache/ccache/issues/148

https://svnweb.freebsd.org/ports/head/devel/ccache/

We are also seeing some bugs that are "unique" to either Mac or Win,
but hard to reproduce with regular Clang or MinGW running on Linux.

https://github.com/ccache/ccache/issues/54
https://github.com/ccache/ccache/issues/156

https://github.com/ccache/ccache/issues/95
https://github.com/ccache/ccache/issues/122


Lately we had some discussion on this, with Joel explaining his view:
https://github.com/ccache/ccache/pull/162#issuecomment-289215585

We could use some more ideas on how to make this work better.
Meanwhile, I've updated Clang and MinGW so that they work again.

I also added a Docker build to define the "basic platform" and
plan on extending this with clang and mingw (like in Travis)

Also updated some more of the "clang-analyzer" tests, and to the
new 64-bit versions of MinGW-w64 and Wine64 (instead of 32-bit).


But what would be the best option to keep the base project both
clean and portable ? And how to make it easier to test/contribute.

I think the current infrastructure with Travis, and now Docker, are
a good baseline, maybe we could add some Mac or Win in the cloud ?

I know that we have access to Appveyor (we tried to use it before),
and that the new "MSYS2" is looking promising (although it is very slow)

https://github.com/ccache/ccache/pull/69
http://www.msys2.org/ (uses Pacman, yeah!)


But are there any better options, or more people wanting to help out ?

My plan was to try to get the memcached and compression working instead.

https://github.com/ccache/ccache/pull/58
(+ https://github.com/ccache/ccache/pull/104)

https://github.com/ccache/ccache/pull/118
(+ https://github.com/ccache/ccache/pull/81)

/Anders


Re: [ccache] Visual C/C++ compiler upgrade

2017-03-13 Thread Anders Björklund via ccache
Jean-Dominique GASCUEL wrote:
> Dear ccache developers,
>
> I just started a fork to try to make ccache compatible with Visual C/C++
> compiler (cl), so one can use it with msbuild or nmake based projects...
> You can review the current state here: https://github.com/jd-gascuel/ccache
>
> Current state:
>   - It starts to do something interesting. But more work is needed to
> handle specific options (.sbr files, debug options, etc.)
>   - I found the Travis stuff very interesting. But because Travis do not
> support windows, I am trying to make an AppVeyor similar stuff.
>  - There is probably issue about Visual vs. MinGW modes ... I did not
> start to investigate that.
>
> Any advices ?

We talked a bit about the Windows version and Appveyor,
in https://github.com/ccache/ccache/issues/122
and https://github.com/ccache/ccache/pull/69
You might want to look into those, even if about MinGW...

I suppose you have already seen other "inspired" work,
like https://github.com/frerich/clcache
or https://github.com/inorton/cclash
Even though those are in different languages (Python/C#)


The lack of autotools (i.e. no decent shell available)
is going to be something of a problem, in the long run.
The same goes for the test suite, that is also going to
need some thinking if it should work on "real" Windows.

Supposedly one could use something like CMake for this,
but there is a risk of making things harder on Unix...
Some other things can make you cringe, like the use of
/options or \\directories, but that goes with being cross-platform.


Think the ultimate question will be up to Joel to decide:

"Are you happy to have a windows branch in the official ccache repo ? 
and to merge it to the default trunk once it works great ?"


https://github.com/ccache/ccache/pull/162

/Anders


Re: [ccache] ccache 3.2 or higher for Ubuntu 14.0

2016-10-17 Thread Anders Björklund
Mats Nilsson wrote:
> I'm a new-bee on ccache and I run clang on Ubuntu 14.04/64bit.
>
> Where can I find an apt-get package with ccache 3.2 or higher?

I think you have to request a backport, or do a PPA repository:

https://wiki.ubuntu.com/UbuntuBackports
https://help.launchpad.net/Packaging/PPA

The current versions are:
precise (12.04LTS) (devel)  3.1.6-1
trusty (14.04LTS) (devel)   3.1.9-1
xenial (16.04LTS) (devel)   3.2.4-1

Or you could perhaps just build it yourself, from the source ?

https://ccache.samba.org/download.html

/Anders


Re: [ccache] How to omit caching of a specific compilation

2016-10-17 Thread Anders Björklund
Mats Nilsson wrote:
> Thank you very much!
>> Is there a way to instruct ccache to not cache the object file from a 
>> specific compilation?
>>
>  Look for the following configuration file options in the ccache(1) man
> page for disable, read_only, and read_only_direct to decide which one you
> need for your situation.

It would also be nice to know why it had to be disabled, if there is a 
bug in ccache. Normally caching a compilation shouldn't hurt anything...

Now, certain compilations (e.g. with timestamps) one might want to avoid
caching, just to not fill the cache with junk. But normally that doesn't happen*.

/Anders


* i.e. it's a bad idea anyway, for other reasons (reproducibility etc)

See https://reproducible-builds.org/ for more on the particular subject.


Re: [ccache] NFS cache

2016-10-17 Thread Anders Björklund
Mats Nilsson wrote:
> I'm experimenting with having a common NFS-disk as cacche for all build 
> agents in TeamCity.

It works, but is kinda slow and full of issues (like cleaning, for 
instance). Using NFS only as secondary cache works (somewhat) better.

I wanted to focus on memcached, but people keep beating that poor old
dead horse. So maybe the "external" cache should be brought back... ?

It was also requested in: https://github.com/ccache/ccache/pull/139

Old code was at: https://github.com/afbjorklund/ccache/tree/external

> Should there be any form of semaphore/lock to support simultaneous access of 
> the cache?

There is a lockfile. That's the only way on NFS, since all forms of 
flock/fcntl are broken (unfortunately). See lockfile.c for the code.
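
The general lockfile technique can be sketched as below. This is not the code from lockfile.c: it assumes that creating a symlink is atomic on the filesystem in question, the `.lock` suffix and owner string are illustrative, and breaking stale locks is left out:

```python
import os
import socket
import time


def acquire_lock(path, timeout=2.0, poll=0.01):
    """Try to take path + '.lock'; return True on success, False on timeout."""
    lock = path + ".lock"
    # The symlink target records who holds the lock (host, pid, time).
    owner = "{}:{}:{}".format(socket.gethostname(), os.getpid(),
                              int(time.time()))
    deadline = time.monotonic() + timeout
    while True:
        try:
            os.symlink(owner, lock)  # fails if the lock already exists
            return True
        except FileExistsError:
            if time.monotonic() >= deadline:
                return False
            time.sleep(poll)


def release_lock(path):
    """Release a lock previously taken with acquire_lock."""
    os.unlink(path + ".lock")
```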

You will also want to avoid the cache ever getting full, since that will
trigger a delete storm from all clients (at once). Including stat's.

Some more discussion in: https://github.com/ccache/ccache/issues/124

Not sure about all this NFS love. Better the devil you know, perhaps ?

/Anders


Re: [ccache] Multiple directories for CCACHE_DIR?

2016-05-17 Thread Anders Björklund
Joel Rosdahl wrote:
>> If not, would be a good idea to implement that?
>
> I think that it could be interesting to investigate, but there are lots of
> things to think about before implementing it. For instance:
>
> What should happen on a cache miss? Should ccache store the result in all
> directories? That could potentially be quite slow. Or should it only store
> the result in the first directory? If so, when and how do the other
> directories get populated?

The way it happens with memcached, is that the writing is offloaded to a
background daemon. So it is handled locally, and "transparently" remote.

There was also an option to not store anything in the cache, only read ?
This is useful when you have for instance a CI build server populating.

> How should configuration file settings work? Should configuration be read
> only from one cache directory or from all?

The way that I looked at it, there was only configuration in the local.
Basically we still have most configuration in environment variables...

> Cache size configuration probably would need to be configured per cache
> directory. Does this imply that CCACHE_MAXSIZE variable no longer should
> override configuration file settings?

There was *no* cleaning of the remote. We had some "incidents", when
all clients started to inventory and clean the shared cache - at once!

The server would do its own cleaning (using something like cron ?),
or just drop old records - this is what memcached does for instance.
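
A server-side cleaning job of that kind could be as simple as the following sketch, run from cron on the server only. `trim_cache` is a hypothetical helper, and using mtime as the age of an entry is an assumption:

```python
import os


def trim_cache(root, max_bytes):
    """Delete the oldest files under root until the total size fits."""
    entries = []
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            st = os.stat(path)
            entries.append((st.st_mtime, path, st.st_size))
            total += st.st_size
    entries.sort()  # oldest first
    removed = []
    for _mtime, path, size in entries:
        if total <= max_bytes:
            break
        os.unlink(path)
        total -= size
        removed.append(path)
    return removed
```

Since only one process does the inventory, the clients never trigger a cleanup of the shared store at the same time.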

> Which statistics counters would make sense to have in all cache directories
> (cache hit/miss, etc.) and which would not make sense (called for link,
> compile failed, etc.)?

Currently memcached keeps its own statistics. Simplistic NFS/HTTP stores
would probably have to implement something similar, for hit and miss ?

I'm not sure that the "external" store (as I called them) would need
to use the same list of statistics as the regular "internal" store.

/Anders


Re: [ccache] Multiple directories for CCACHE_DIR?

2016-05-17 Thread Anders Björklund
Wilson Hong wrote:

> Hi ccache, I am trying to improve ccache build performance by
> improving cache hit rate. One thing I wanna try is to have multiple
> folders specified in CCACHE_DIR. Where first level is local cache,
> and second level in LAN NFS, third level on AWS s3. Ccache will start
> looking for local ccache first, if not found, then search in NFS
> folder and then S3, analogy CPU L1-L2-RAM cache architecture. I take
> a look ccache man page: https://ccache.samba.org/manual.html, but
> still cannot figure out how to do that. Is that supported in ccache?
> If not, would be a good idea to implement that? Any advice is
> welcome. Thanks!

It is not supported out-of-the-box, but it makes perfect sense.

You can put your primary cache directory on NFS, this is described
in the manual (but then *everything* will go out over the network).

When doing support for what would become the "dev/memcached" branch,
we used something called an external cache that does what you want.
It designates a second (or more) directory, where ccache would also
look for cache hits. The implementation of it has varied a bit...

First we would introduce a new kind of hits, like a "half hit".
But that was too much hassle, so we just copied the external file
to the local cache and called that a "hit" too (somewhat slower).
Eventually one would probably want to have separate statistics ?
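
The "copy to local and call it a hit" behaviour can be sketched as follows. This is an illustration of the idea rather than the code from the external branch; `lookup` and its return values are hypothetical, and keys are assumed to be relative file paths inside each cache directory:

```python
import os
import shutil


def lookup(key, local_dir, external_dirs):
    """Return (path, source) for key, or (None, None) on a miss."""
    local = os.path.join(local_dir, key)
    if os.path.exists(local):
        return local, "local"
    for ext in external_dirs:
        candidate = os.path.join(ext, key)
        if os.path.exists(candidate):
            os.makedirs(os.path.dirname(local), exist_ok=True)
            # Copy to a temporary name, then rename into place, so that
            # other processes never see a half-written entry.
            tmp = "{}.tmp.{}".format(local, os.getpid())
            shutil.copyfile(candidate, tmp)
            os.replace(tmp, local)
            return local, "external"
    return None, None
```

Returning the source alongside the path is what would make separate statistics for local and external hits possible.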

Now with the memcached support, it could be that one would want to
make the "external" support more generic - perhaps also include S3.


The newer version (for 3.2) is available from here:
https://github.com/itensionanders/ccache/tree/external

It allows for one "external" directory, using the regular file layout.
The older 3.1 version would allow comma-separated list of directories.

This version has the new option refactored into what *could* become a
storage backend, that would allow for other SQL and NoSQL variants...


I'm not sure what library is best for talking ("directly") with S3,
but it shouldn't be impossible to adapt - it's a simple 4 step API.

The theory is that NFS and HTTP (or SQL) will "lose" compared with
memcached, but it would be nice to have some more solid statistics ;-)

The memcached branch (for master) is available here:
https://github.com/jrosdahl/ccache/tree/dev/memcached


There are alternative backends for MySQL and for Couchbase (NoSQL),
but those have not yet been refactored... So only the FS, for now.

/Anders


Re: [ccache] Buffer size for IO operations is too small

2016-04-03 Thread Anders Björklund
Anders Björklund wrote:
> Michael Kolomeytsev wrote:
>> I've discovered that there is too small buffer size for IO in ccache: 16k
>> or 10k
>> (in hash_fd, copy_fd, copy_file).
>>
>
> But your observations are very interesting, and please post
> more if you have it. Would also be nice to have some follow-up
> on the observation about ccache problems with multiple cores:
> https://github.com/jrosdahl/ccache/issues/54 (also on OS X)
>
> I'm thinking that hash and copy could do with different macros...

Actually three macros, hash, compress/decompress and plain old copy.
Thought I'd move the "copy" case aside, away from the other buffers...


You'd think that copying a file would be a simple thing to do, right ?
Actually, on some systems like Windows or Mac OS X it is. But on Linux:

Found this interesting blog post, that came with some benchmarks too:
http://blog.plenz.com/2014-04/so-you-want-to-write-to-a-file-real-fast.html

So the first thing to do would be to make the I/O buffer size into a
whole multiple of the block size, that is: 16384 instead of 10240.
Avoids having to do partial page copies later. And then allocating the
buffer in kernel space instead of user space sounded like a good idea.

But having to look for various OS/kernel versions of sendfile()? Eww.
Might as well stick with "splice()", since the other main systems
have solutions already: Win32 has CopyFile and OS X has copyfile.
And doing some "advise/allocate" sounded easy, but had pitfalls too.

Here is the end result, in case anyone is interested in a preview:
https://github.com/jrosdahl/ccache/compare/master...itensionanders:uncompressed


It sounded like a good idea, but needs some actual benchmarks to see
whether it was actually worth it. Probably should check st_blksize too.
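
Picking a buffer size that is a whole multiple of the block size, using st_blksize, might look like this. It is a plain read/write sketch in Python, not the splice-based code from the branch; the 4096 fallback and the function names are assumptions:

```python
import os


def round_up_to_block(size, block):
    """Round size up to the nearest whole multiple of block."""
    return ((size + block - 1) // block) * block


def copy_file(src, dst, wanted=10240):
    """Copy src to dst with a block-aligned buffer; return the size used."""
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        # st_blksize is missing on some platforms; fall back to 4096.
        block = getattr(os.fstat(fin.fileno()), "st_blksize", 0) or 4096
        bufsize = round_up_to_block(wanted, block)
        while True:
            chunk = fin.read(bufsize)
            if not chunk:
                break
            fout.write(chunk)
    return bufsize
```

With a 4096-byte block, 10240 rounds up to 12288, while 16384 is already a whole multiple and stays as it is.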

The actual I/O can probably be made twice as fast (e.g. for a 1M file)
Question is whether it makes any real impact on the ccache run time ?

pipe+splice + advices + trunc   1175ns  1283ns  1290ns
read+write 4bs                  1537ns  2126ns  2210ns  (+ 30.8%)
read+write 10k                  2334ns  2356ns  2668ns  (+ 98.6%)
read+write bs                   2515ns  2692ns  4591ns  (+ 114.0%)

But 256K seemed like overkill (over 16K), at least for plain copy I/O.
Might still be some additional benefits when doing gzip or md4, though.

/Anders


PS. We gave up on mmap already, for other reasons (high maintenance)
https://github.com/jrosdahl/ccache/commit/c358e7c801e265ce07e909d75f3f3fd4e16c7f65


Re: [ccache] Buffer size for IO operations is too small

2016-04-03 Thread Anders Björklund
Michael Kolomeytsev wrote:
> I've discovered that there is too small buffer size for IO in ccache: 16k
> or 10k
> (in hash_fd, copy_fd, copy_file).
>
> I did simple fix and run several tests (on mac osx) trying to recompile
> chromium.
> (Of course there was 100% cache hit).
> Results (pay your attention to sys time):

It would probably be interesting to make this configurable,
especially when actually doing compression and checksumming
(and not just copying). We could also do with some updated
and more formalized benchmarks, so that we can all compare ?

But your observations are very interesting, and please post
more if you have it. Would also be nice to have some follow-up
on the observation about ccache problems with multiple cores:
https://github.com/jrosdahl/ccache/issues/54 (also on OS X)

I'm thinking that hash and copy could do with different macros...

Also wondering if we should move from stack to heap allocation ?
Or if we should do smaller buffers for smaller files, perhaps.

Played around with some different md4 and different compressions.

/Anders


Re: [ccache] explanation of ccache stats

2016-03-18 Thread Anders Björklund
Stéphane Charette wrote:

>> I created a new web page with some example graphs from the ccache plugins
>> in case anyone wants to see what it looks like:
>> https://www.ccoderun.ca/ccache-munin/
>
> ...and I should have mentioned, if anyone has more ideas on additional
> graphs I can extract from ccache information, let me know and I'll create
> the necessary plugins.

Hmm, I looked at the code a bit and I don't think this will fly:

find /home -maxdepth 3 -type f -name ccache.conf | sort

In an enterprise setting, this will cause a lot of thrashing
only to come up with nothing since ccache isn't stored there.
One would probably have to specify the $CCACHE_DIR in config
(and it is possible to share one ccache between several users)


This is reasonable, but misses those things not being reported:

ccache_total=$((ccache_hit + ccache_miss))

That is, if ccache fails for some other reason you won't see it.
Maybe those _are_ uninteresting, but it could be nice to know ?
If reporting the total too, then the "unknowns" can be plotted.
But reporting _all_ the failure reasons makes too many counters.
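
Plotting the "unknowns" only needs one extra number. A small sketch, assuming the plugin can get a count of all ccache invocations from somewhere (for example by summing every counter); the function names are hypothetical:

```python
def unreported_calls(total_calls, hits, misses):
    """Calls that were neither a cache hit nor a cache miss."""
    unknown = total_calls - hits - misses
    if unknown < 0:
        raise ValueError("hits + misses exceeds total calls")
    return unknown


def hit_rate(hits, misses):
    """Hit rate over cacheable calls only, as a fraction in [0, 1]."""
    cacheable = hits + misses
    return hits / cacheable if cacheable else 0.0
```

That gives a single "other" series on the graph without a counter per failure reason.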


Otherwise, it is looking good! Especially like the "effectiveness".
Not sure if 1-min or 5-min is best, guess it depends on volume...

Should clean ours up and release it too. It is written in Python.
Maybe one could read stats, instead of forking `ccache -s` ? Nah.

i.e. libccache or something

/Anders


Re: [ccache] Bug with clang

2016-03-14 Thread Anders Björklund
Éric Chamberland wrote:
>>> How can I retrieve the documentation warnings with ccache?  Do I do
>>> something wrong?
>> Hmm, since you are using CCACHE_CPP2=1 you *should* be getting
>> source-level warnings too. However, if you cached the previous
>> result without using that setting (CCACHE_CPP2) you could still
>> get the cached results, i.e. the one without the extra warnings.
>>
>> But it is strange because CCACHE_CPP2 should be in the hash now.
>> Could be a bug with direct mode perhaps, that it never runs cpp.
>> It only looks at the (unchanged) files, and returns the cache...
>> If you clear the cache (or CCACHE_NODIRECT=1), does it remain ?
> ok, you are right.  I did many tries before writing the mail... and
> before finding your below mentioned web page, I tried without
> CCACHE_CPP2...
>
> Now, if I take care of doing a ccache -C before each configuration
> change, I can observe the warnings with ccache... So it was in fact
> cached when I tried the test and gave you the output...
>
> However, as pointed in my previous mail, the comments are not into the
> hash, then if I compile foo.cc with ccache and get the 3 warnings, then
> modify the documentation *without* changing the number of lines (ie:
> just change "\param" by "\note"), then the output is taken from the
> previous cached compilation... :/

Yeah, obviously I didn't think that one through completely :-)

For the hash to work (or rather: not work), then the preprocessor
needs to keep the comments in. Normally those don't affect the
object file, and discarding them is a feature - you get cache hits
when you only change a comment that doesn't affect the .o output.

But if you are using something like Doxygen, you do get output. Later.

And it does of course affect the .stderr output, that you were after.

>> So maybe only keeping comments is not enough, full source is needed ?
> For us, we compile with -Weverything, which is a lot more severe into
> the analysis.  We have to silent many warnings with "-Wno-*" options,
> but it is our burden to keep the code compiling "correctly", meaning
> absolutely no warnings.

Seems like both could be useful... (available as separate options)

> We have recently moved to use clang to check for documentation warnings
> too, to keep us from inserting new "bad" documentation... but we are
> used to compile with ccache+icecream or ccache alone (in our nightly
> tests and continuous integration) and don't want to work without
> ccache... it is unthinkable :)
>
> So, for me, the "keep comments" option, automated or not, is something I
> would try for sure, even if I have to compile a specific branch/sha of
> ccache... :)

I added the option, at: https://github.com/jrosdahl/ccache/pull/74

>> And maybe how to use ccache with clang should be better documented ?
> Maybe... can't tell... I found your page as soon as I googled for "clang
> ccache"... :)

Actually it should work out-of-the-box now, including the coloring.
It's also part of the test scope now, on both OS X and Linux (ELF).
Hopefully that means it'll continue to work, thanks to Travis CI.
I meant in the ccache manual, but who reads those things anyway :-)

>> It was only recently that the clang tests were fixed (396df7e), so it
>> seems like most developers are using gcc - and thus that is assumed.
>> I think we will see more things like this with GCC 5 (and later) too,
>> so some extra lines about source-level warnings are probably needed...
> We use also gcc and ICC, but clang is there to help us write better
> code... and documentation now... almost... :)
>> For now, the best workaround here is the "run_second_cpp" config.
> Done! :)

And you now (soon) have a "keep_comments_cpp" config to complement it!

Now, wonder if ccache actually compiles with -Weverything enabled ?
Nope - that seems to take a lot of effort. Only 187 errors remaining.

/Anders


Re: [ccache] Shared ccache directory between docker containers

2016-02-08 Thread Anders Björklund
Ragnar Rova wrote:
> Thanks, i'll try it out. How stable is the memcached support?
>
>> It would also be interesting to test sharing a cache between containers
>> on different hosts, by using the "memcached" feature over the network:
>>
>> https://lists.samba.org/archive/ccache/2016q1/001394.html

It is ready for public testing, all known issues addressed:
https://github.com/jrosdahl/ccache/pull/58

An earlier version of it has been in production for years:
https://github.com/jrosdahl/ccache/pull/30


By default it will still use the file cache, just like before.
You have the option to only use memcached for storing objects.

We are looking for some more testers and for other feedback...
And there's a few minor tweaks being done, but nothing major.

/Anders


Re: [ccache] Optimizing MD4

2015-12-11 Thread Anders Björklund
Andrew Stubbs wrote:
>> I would be interested in your thoughts on how to speed that part up.
>
> My implementation, which does a bunch of other things besides, hence why
> it's not fit to post[*], launches a background task which creates a unix
> domain socket in the cache directory (the windows version uses plain old
> TCP).
>
> Each invocation of ccache then connects to that socket and asks the
> daemon to do the MD4 scan on its behalf. The daemon checks the mtime on
> the file and serves the MD4 from its memory cache if nothing has
> changed. The stat call could probably be optimized away if the cache is
> very fresh (<1s?)

So basically something similar to the "sloppiness" file_stat_matches ?
Compare size/mtime/ctime, rather than rehashing the content of a file.
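
The idea of trusting size/mtime/ctime in place of rehashing can be sketched as an in-memory cache. This is neither the daemon described above nor ccache's own code; `HashCache` is hypothetical, and SHA-256 is used in place of MD4 because MD4 is not always available in Python's hashlib:

```python
import hashlib
import os


class HashCache:
    """Remember file digests, keyed on a (size, mtime, ctime) signature."""

    def __init__(self):
        self._entries = {}
        self.rehashes = 0  # how many times content was actually read

    def digest(self, path):
        st = os.stat(path)
        signature = (st.st_size, st.st_mtime_ns, st.st_ctime_ns)
        cached = self._entries.get(path)
        if cached is not None and cached[0] == signature:
            return cached[1]  # stat matches: skip reading the file
        with open(path, "rb") as f:
            value = hashlib.sha256(f.read()).hexdigest()
        self.rehashes += 1
        self._entries[path] = (signature, value)
        return value
```

The trade-off is the usual one for this kind of sloppiness: a file rewritten with the same size and timestamps would be served a stale digest.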

> In theory, what you get is ccache spending less time in MD4, but more
> time in I/O wait. It does seem to be faster, over all, but that might
> depend on your hardware.
>
> However, even if the latency of each ccache invocation is the same, the
> fact that they're basically idle means you can usefully crank up the
> parallelism for all but the initial build.
>
> You could, in principle, use this communication to limit how many
> cache-miss compilations are permitted to run in parallel, and therefore
> run "make -j" for maximum parallelism without fear of melting your memory.

Something like that is what I meant with a new "prefix" wrapper for cpp.
Similar to the current wrapper for cc, which does that with e.g. distcc.

> Unfortunately, I've moved on to other projects and don't have much time
> to work on this stuff any more.

Thank you for your ideas. Will check the code out too if I get the time.
It seems that there are some opportunities left for faster manifest/cpp.

/Anders


Re: [ccache] Using ccache with memcached

2015-12-06 Thread Anders Björklund
Pierre Tardy wrote:

>> Here is such an attempt, to keep *both* features available:
>> https://github.com/itensionanders/ccache/tree/memcached-only
>
> I like it very much. I think it adds great value for ccache, and to my
> old memcached-only attempt.

Yeah, if it doesn't bloat the code base too much it makes sense to
leave the final decision up to the user with runtime configuration.

> I did not realize the use for moxi also as a connection "keep-alive"
> mechanism, and a way to hide the syn-ack latency overhead. I think this
> is what you mean by "avoid some of the network overhead". This would perhaps
> deserve a little more detail in the doc.

Right, that is what I meant. Think it depends a bit on the number
of servers involved, but I don't think it can hurt much either way.
Suppose another paragraph or two couldn't hurt, but the more advanced
config can remain with memcached and moxi documentation - I think ?

However, it does make a lot of sense to offer the "complete package"
and is something that we are looking into. Software and configuration.
For us that would entail ccache*, moxi, distcc, memcached and distccd.
So it spans at least three or four different open source projects.

* including zlib and libmemcached

> Even if it is not ready, I think it would be worth to create a pull
> request, and make it easier for everybody to review the current code.
>
> This is what I used to do it, but it's not easy to put and track review
> comments.
> https://github.com/jrosdahl/ccache/compare/master...itensionanders:memcached-only

Yes, that works for testing. You can append a .diff or a .patch to it,
and use "diff" or "git am". But that's more read-only, and not social.

I wanted to do some more squashing and rebasing to "master" - but I 
suppose there is no reason why all that couldn't be done as a PR...

> Mike already put a bunch of coding style review comments on my own
> commits. I would rather not fix them myself, as I know you already have
> an evolved version which is more suitable for merging.

I think some of these might also have been fixed by "uncrustify" ?

/Anders


Re: [ccache] Using ccache with memcached

2015-12-06 Thread Anders Björklund
On 2015-12-07 at 04:48, Mike Frysinger wrote:
> On 02 Dec 2015 20:16, Pierre Tardy wrote:
>>> i don't think getting rid of the fs makes sense, but having memcache
>>> be available dynamically as an additional layer sounds fine..
>>
>> It does make a lot of sense for me as I have a high performance network,
>> which is faster than local harddrive. So I would insist on keeping an
>> option for memcached only.
>
> that isn't what i meant.  i don't care about runtime config options but
> about (1) the code and (2) build time control.  fs should remain in the
> source and memcache should be an additional configure flag which allows
> the user to select it at runtime.

That is the way that it currently works.

There is now a --enable-memcached flag, to avoid libmemcached being a
mandatory requirement. In the code, it uses a #ifdef HAVE_LIBMEMCACHED
But it still doesn't really *do* anything, unless you also set the
configuration for memcached_conf (containing for instance --SERVER).

Then there is a *second* boolean option, now called memcached_only,
that only uses the regular cache for manifests and for stats / conf.
So if that is set, it will avoid storing objects and friends in the
file system cache but only store those in the memcached servers...
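
How the two settings combine for object files can be sketched roughly as below. The function name and return values are hypothetical, and the sketch only models object files, not manifests or stats:

```python
def object_storage(memcached_conf="", memcached_only=False):
    """Return the backends that object files would be written to.

    Illustration of the two options described above, not the actual code.
    """
    if not memcached_conf:
        # Without a server configuration the memcached support stays inert,
        # whatever memcached_only says.
        return ["fs"]
    if memcached_only:
        return ["memcached"]  # objects skip the file system cache
    return ["fs", "memcached"]  # file cache as before, memcached in addition
```

For example, object_storage("--SERVER=localhost", memcached_only=True) gives ["memcached"], while leaving memcached_conf empty always gives ["fs"].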


Note that the use of binary packages (rather than using source ports)
usually ends up just picking one of the options for you anyway ?
So in that sense it's "better" to have it selectable at runtime, and
for the feature to be there (by default). Otherwise it is "gone".

Squashed everything, PR coming shortly.

/Anders


Re: [ccache] Using ccache with memcached

2015-12-04 Thread Anders Björklund
> The option to switch the to_cache/from_cache can be made available
> separately, like it was in your PR. But it can use another config ?
> Probably needs some updating and refactoring, and it would be nice
> to try and keep the code duplication between them to a minimum...
>
> i.e. between the current filesystem code and the memcached code
>
> I can make an attempt to merge them, or if you want to do it...
> To add a config like "memcached_only", next to "memcached_conf" ?
> If you have a single server, then *neither* option makes any sense.
> So it all depends on the setup, and needs to be benchmarked further...

Here is such an attempt, to keep *both* features available:
https://github.com/itensionanders/ccache/tree/memcached-only

If you set memcached_only to true, it will not use the fs cache.
Currently it will store the manifests locally, as in the original.

Also added a basic unit test to it.

Needs some cleanup, but works OK ?

/Anders


Re: [ccache] Using ccache with memcached

2015-12-02 Thread Anders Björklund
Pierre Tardy wrote:
>
> i don't think getting rid of the fs makes sense, but having memcache
> be available dynamically as an additional layer sounds fine.
>
> It does make a lot of sense for me as I have a high performance network,
> which is faster than local harddrive. So I would insist on keeping an
> option for memcached only.

Both features could be kept.

The "memccached" layer is basically the same (I extended it a bit,
and changed the API a little...) and so is your memcached format.
I suppose you could use our memcached with just a small disk cache,
but you'd get a lot of (unnecessary) writes and cache cleanups ?

IIRC, your manifests (and headers) were still using the local drive.

The option to switch the to_cache/from_cache can be made available
separately, like it was in your PR. But it can use another config ?
Probably needs some updating and refactoring, and it would be nice
to try and keep the code duplication between them to a minimum...

i.e. between the current filesystem code and the memcached code

I can make an attempt to merge them, or if you want to do it...
To add a config like "memcached_only", next to "memcached_conf" ?
If you have a single server, then *neither* option makes any sense.
So it all depends on the setup, and needs to be benchmarked further...


I had also added a "readonly" flag, for builds with lots of misses.

It's common to have a shared set of base files, and then with some
local alterations to those... So you would have a "blessed" build
filling up the memcached, and then individual builds based on that
would reuse objects if available but not add their one-offs to it.

The variant shared on NFS would just use CCACHE_READONLY for this.
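
The "blessed build fills, individual builds only read" pattern might look like this in outline. A plain dict stands in for the memcached servers, and CacheClient is a hypothetical name:

```python
class CacheClient:
    """Client for a shared cache; a dict stands in for the memcached servers."""

    def __init__(self, store, readonly=False):
        self.store = store
        self.readonly = readonly

    def get(self, key):
        return self.store.get(key)

    def put(self, key, value):
        if self.readonly:
            return False  # one-off results are not added to the shared cache
        self.store[key] = value
        return True
```

A blessed build would use CacheClient(store), while individual builds would use CacheClient(store, readonly=True) and still get hits on the shared base files.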

/Anders


Re: [ccache] Implementing a Read-only HTTP CCACHE_DIR(resurrect)

2015-12-01 Thread Anders Björklund
vkr wrote:
> Hello,
> I stumbled across this thread -
> https://lists.samba.org/archive/ccache/2012q2/000879.html which is years
> old,
> Coincidentally, I did some work along similar lines already, without
> realizing there was this discussion about this topic here,
> and I appreciate some comments/suggestions on my approach so far.

This is interesting, there was some renewed interest in the memcached
patch that was proposed in about the same time frame (i.e. in 2013)

https://github.com/jrosdahl/ccache/pull/30
https://lists.samba.org/archive/ccache/2013q3/001124.html

> Having cache on NFS is comparatively the easy option from configuration
> point of view, however, there can be environments where
> for whatever the reasons, NFS server is a few hops away, while there are
> other machines that are closer to the build farm, in which case,
> having a HTTP CCACHE_DIR does seem like a reasonably better option as it
> involves less configuration havoc on every machine in the build farm.

It also has lots of problems with for instance locking (workaround
is included) and overhead when updating modification timestamps etc.

> Keeping the above as use case, I've implemented HTTP CCACHE_DIR in my fork
> - https://github.com/venkrao/ccache
> This is a very crude throw-away test from a beginner C Programmer, that
> does the following. Care has been taken to ensure it does behave like
> existing ccache to the extent I know so far, and I did have successful runs
> of modified ccache with no core/crash or surprise failures.

I haven't been able to test your code, but it does sound like there
are some shortcomings in the design (e.g. like it being read-only).

The repository has some issues, in that it has been disconnected ?
It also has a bunch of generated files, being imported from tarball.

> Unfortunately, in our existing environment I could not see visible
> improvement between our NFS based cache setup and this new approach.
> I cannot attribute the lack of performance to anything right now.

It could be inherent with HTTP, just like it was in NFS before ?
Using a local filesystem cache or a shared memcached seems better...

I gave up on _my_ http version, when I found the memcached version.
Will post some more details about my own version of it separately.

/Anders


[ccache] Using ccache with memcached

2015-12-01 Thread Anders Björklund
Hi all!

While the idea of using memcached with ccache is nothing new (*),
it seems to be more popular now with more memory being available...

* https://lists.samba.org/archive/ccache/2010q4/000686.html
   https://lists.samba.org/archive/ccache/2013q2/001120.html

Pierre Tardy made a PR (https://github.com/jrosdahl/ccache/pull/30)
to replace the filesystem ("fs") cache with memcached altogether.


We have gone with a different approach, to use memcached only as a
secondary cache - while preserving the primary cache (on the disk).

Also added support for big files larger than memcached default (1M),
without having to modify the servers - by splitting them up if needed.
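
Splitting a large value on the client side could be sketched like this. The key naming scheme ("key.N" plus an index entry holding the chunk count) and the chunk size are assumptions for illustration, not the layout used in the branch; a dict stands in for the server:

```python
CHUNK_SIZE = 1000 * 1000  # assumed value, chosen to stay under the 1M default


def put_split(store, key, data, chunk_size=CHUNK_SIZE):
    """Store data under key, split into numbered chunks."""
    chunks = [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]
    if not chunks:
        chunks = [b""]  # keep empty values representable
    store[key] = str(len(chunks)).encode()  # index entry: number of chunks
    for n, chunk in enumerate(chunks):
        store["%s.%d" % (key, n)] = chunk


def get_split(store, key):
    """Reassemble a value; return None if the index or any chunk is missing."""
    count = store.get(key)
    if count is None:
        return None
    parts = [store.get("%s.%d" % (key, n)) for n in range(int(count))]
    if any(p is None for p in parts):
        return None  # a chunk was evicted: treat the whole entry as a miss
    return b"".join(parts)
```

Because memcached may evict any single item, a reader has to treat a missing chunk as an ordinary cache miss.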


Manifests are just stored in a single entry in the memcached, while
other files are being combined into one entry per cache key (md4-len)
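
Combining several result files into one entry can be illustrated with a simple length-prefixed container. The layout below is invented for this sketch and is not the format used by the memcached branch:

```python
import struct


def pack_entry(files):
    """Pack {name: bytes} into one blob: a count, then name/data with lengths."""
    out = [struct.pack("!I", len(files))]
    for name, data in sorted(files.items()):
        raw = name.encode()
        out.append(struct.pack("!II", len(raw), len(data)))
        out.append(raw)
        out.append(data)
    return b"".join(out)


def unpack_entry(blob):
    """Inverse of pack_entry: return the {name: bytes} mapping."""
    (count,) = struct.unpack_from("!I", blob, 0)
    pos = 4
    files = {}
    for _ in range(count):
        name_len, data_len = struct.unpack_from("!II", blob, pos)
        pos += 8
        name = blob[pos:pos + name_len].decode()
        pos += name_len
        files[name] = blob[pos:pos + data_len]
        pos += data_len
    return files
```

With one entry per key, a single round trip to the server fetches the object file, stderr output and dependency file together.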

The idea is that hitting this secondary cache is still cheaper than
doing a compile again, but could be slower than not using the network.
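
The lookup order this implies (disk first, then memcached, then the compiler) might be sketched as below. Dicts stand in for both caches, and copying a secondary hit back into the primary cache is an assumption of this sketch:

```python
def lookup(key, primary, secondary, compile_fn):
    """Return (result, source), trying disk, then memcached, then a compile."""
    result = primary.get(key)
    if result is not None:
        return result, "primary"
    result = secondary.get(key)
    if result is not None:
        primary[key] = result  # assumed: keep a local copy for next time
        return result, "secondary"
    result = compile_fn()  # the expensive path both caches try to avoid
    primary[key] = result
    secondary[key] = result
    return result, "compiled"
```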

The overhead of having each and every ccache invocation call memcached,
can be avoided by setting up a local memcached proxy ("moxi") server.


There is a public branch available, rebased from a 3.1 version:
https://github.com/itensionanders/ccache/tree/memcached

It has been (recently) updated to "3.2-maint", but not to "master".
Being a work in progress still, it's not ready for merging just yet.

But I would like some early feedback, and perhaps some more testing ?
More benchmarking needs to be done, and for some different scenarios.

/Anders


Re: [ccache] Caching failed compilations

2015-07-07 Thread Anders Björklund
Hi Joel and all!

I also found the idea of storing failures interesting, and made a quick sample 
implementation earlier:

https://github.com/itension/ccache/compare/store_failures

Feature is enabled by setting $CCACHE_STOREFAILURES

It does store the status as a separate file, but on the other hand there is no 
object file stored for failures.
One could look for an object file (success) *before* looking for a status file 
(failure), to cut down on stat's ?
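
A rough sketch of that lookup order, assuming a flat cache directory and a hypothetical ".status" file that holds the exit code as text (the real cache layout and file names may differ):

```python
import os


def cached_result(cache_dir, key):
    """Return ("hit", path), ("failure", exit_code) or ("miss", None)."""
    obj = os.path.join(cache_dir, key + ".o")
    if os.path.exists(obj):  # common case: one stat and we are done
        return "hit", obj
    status = os.path.join(cache_dir, key + ".status")  # hypothetical name
    try:
        with open(status) as f:
            return "failure", int(f.read().strip())
    except FileNotFoundError:
        return "miss", None
```

Successful compiles then never pay for the extra status lookup.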

My biggest fear is that it will store I/O errors and whatnot, with no easy 
way to rebuild (needs a recache)
So I made it opt-in, rather than default. So far, it seems that almost 
everything in cc returns exit code 1


I was investigating cutting down on the number of files, and stored everything 
in a LMDB* database instead...
The biggest downside is that adding new files to the cache (and cleaning) now 
becomes much more involved.

* https://en.wikipedia.org/wiki/Lightning_Memory-Mapped_Database

We already had a compact format developed for the memcache extension* (that is 
also proving to be very useful).
It does make extending the format harder, but I suppose one could use the 
version in the CCH1 header for that ?

* 
https://github.com/tardyp/ccache/commit/33852da77f54c9227cb90e013e1bb186a7d315c2


I am hesitant to replace the default (files), but see great potential when 
combining memcached with distcc.
It opens up for sharing a secondary cache between *several* servers, but 
without having to do a recompile.

So we use it like: ccache - memcached - distcc

Will clean up database/memcached for public testing...

It is possible to convert between the different cache formats, since the actual 
files inside are all the same.
A simple conversion script (in python) is provided. A little slow if the cache 
is in use, but otherwise it's OK.

/Anders


From: ccache-boun...@lists.samba.org [ccache-boun...@lists.samba.org] on behalf of
Andrew Stubbs [a...@codesourcery.com]
Sent: 7 July 2015 10:58
To: Joel Rosdahl
Cc: Akim Demaille; ccache list
Subject: Re: [ccache] Caching failed compilations

On 06/07/15 21:44, Joel Rosdahl wrote:
> That sounds like a reasonable idea, but I have occasionally seen empty
> object files in large and busy caches (it could be due to filesystem
> failure, hardware failure or hard system reset), so I'm afraid that
> using zero-length object files won't work out in practice. See also
> https://bugzilla.samba.org/show_bug.cgi?id=9972. But maybe writing some
> special content to the object file would be OK?

OK, fair enough, but I'd say that once you've opened the file and
checked the magic data then you've already killed performance. How about
a magic length that can be observed in the stat data?

A failure can be confirmed by a read, if and only if the length matches,
but a compile success will remain on the quick path.
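
The magic-length check could be sketched as follows. The marker content is a hypothetical placeholder; any fixed byte string whose length is unlikely for a real object file would serve:

```python
import os

FAILURE_MARKER = b"CCACHE-COMPILE-FAILED\n"  # hypothetical magic content


def is_cached_failure(path):
    """True only if the file has the magic length and the magic content."""
    if os.stat(path).st_size != len(FAILURE_MARKER):
        return False  # quick path: decided from the stat data alone
    with open(path, "rb") as f:  # rare path: confirm with a read
        return f.read() == FAILURE_MARKER
```

A successful compile result almost never has exactly this length, so it stays on the stat-only path.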

A cache-hit for a compile failure need not be the *most* efficient code
path; it will likely end the build process. As long as it's faster than
the slow compile failures the OP cares about then all is well.

>> Sorry, I don't see any advantage in this scheme. You might save a
>> few bytes of disk space, and maybe a few inodes, but I've not seen
>> any evidence that those are a problem. You'll also add extra file
>> copies to every cache miss, and those are already expensive enough.


> My primary motivation for considering the mentioned scheme is to reduce
> disk seeks, not disk space. If you have a cold disk cache (on a rotating
> device), every new i-node that needs to be visited potentially/likely
> needs a new disk seek, which is slow. If all parts of the result are
> stored in one contiguous file, it should likely be quicker to retrieve.
> But as mentioned earlier, I have no data to back up this theory yet.

My understanding is that when a disk read occurs the kernel reads the
entire page into the memory cache. Subsequent inode reads will likely
hit that cache, so reading two inodes is nearly as cheap as reading one.
The system call overhead is constant, however.

> A secondary motivation for the scheme is that various code paths in
> ccache need to handle multiple files for a single result. There can now
> be between two (stderr, object) and six (stderr, object, dependency,
> coverage, diagnostics, split dwarf) files for each cached result. If one
> of those files is missing, then the result should be invalid. This is
> quite painful and there are most likely some lurking bugs related to this.

OK, that's quite a lot of files. Hopefully it does not look for a file
unless it really ought to be there? I worry that you'll hurt the common
case (just two files) in order to help the uncommon case, and that that
is already about as good as it can be (especially with hard-links).

> A third motivation is that it would be easier to include a check sum of
> the cached data to detect corruption so that ccache won't repeatedly
> deliver a bad object file (due to hardware error or 

Re: [ccache] [PATCH] Add support for coverage (compiling for gcov)

2015-03-31 Thread Anders Björklund

> (Sorry about the very delayed answer.)

No problem!

> The patch looks good! I plan to only apply fixes to serious bugs on
> 3.1-maint, so I'll focus on the 3.2-maint version.

That is OK, we can backport to 3.1 downstream if needed. Will focus on
master (and 3.2).

Here is a delayed update, with updated versions for 3.1-maint and 3.2-maint:

https://github.com/itension/ccache/compare/3.1-maint...coverage-3.1-maint
https://github.com/itension/ccache/compare/3.2-maint...coverage-3.2-maint

After this change, it should be able to return cache hits for --coverage as 
well.

> 1. I get this test suite failure with GCC = 4.7:
> No failure with GCC = 4.6.
>
> I guess that the coverage (empty) test should check that the two runs
> either both produce no test.gcno files or both produce identical test.gcno
> files?

Right, I will need to look into this issue with a newer compiler and get back 
to you.

There's also some issues with absolute/relative paths and cache hits from when 
using a base directory...

Turns out that it doesn't really matter if the file is empty, or just small (12 
bytes).
As long as the code handles the missing case for older compilers, it'll be 
fine...

Also found the issue with lcov and relative paths (with basedir) and the cache 
hits.
It needs to record the full input file path for coverage, and not make it 
relative...

> 2. You wrote "Please include these in ccache, under GNU General Public
> License v3." Just to clarify: Do you agree to use the same license as the
> rest of ccache does, which is GPLv3 or any later version?

Yes, I meant to use the same license as the rest of the ccache software. So: 
GPLv3+ (or any later version) it is.

To be perfectly clear:
  This program is free software; you can redistribute it and/or modify it under
  the terms of the GNU General Public License as published by the Free Software
  Foundation; either version 3 of the License, or (at your option) any later
  version.

  This program is distributed in the hope that it will be useful, but WITHOUT ANY
  WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A
  PARTICULAR PURPOSE. See the GNU General Public License for more details.

Thankful for any feedback on the actual code itself, or any other 
considerations.

/Anders


Re: [ccache] [PATCH] Add support for coverage (compiling for gcov)

2015-02-06 Thread Anders Björklund
Joel Rosdahl wrote:

> (Sorry about the very delayed answer.)

No problem!

> The patch looks good! I plan to only apply fixes to serious bugs on
> 3.1-maint, so I'll focus on the 3.2-maint version.

That is OK, we can backport to 3.1 downstream if needed. Will focus on
master (and 3.2).

> Two questions:
>
> 1. I get this test suite failure with GCC = 4.7:
>
> % CC=gcc-4.7 ./test.sh direct
> ...
>
> No failure with GCC = 4.6.
>
> I guess that the coverage (empty) test should check that the two runs
> either both produce no test.gcno files or both produce identical test.gcno
> files?

Right, I will need to look into this issue with a newer compiler and get back 
to you.

There's also some issues with absolute/relative paths and cache hits from when 
using a base directory...

> 2. You wrote "Please include these in ccache, under GNU General Public
> License v3." Just to clarify: Do you agree to use the same license as the
> rest of ccache does, which is GPLv3 or any later version?

Yes, I meant to use the same license as the rest of the ccache software. So: 
GPLv3+ (or any later version) it is.

/Anders



[ccache] [PATCH] Add support for coverage (compiling for gcov)

2015-01-06 Thread Anders Björklund
Hi!

We've added support for gcc --coverage (including -fprofile-arcs and 
-ftest-coverage) to ccache.

You can find the patches, for the maint versions, over on 
https://github.com/itension/ccache:

https://github.com/itension/ccache/compare/3.1-maint...coverage-3.1.10.patch

https://github.com/itension/ccache/compare/3.2-maint...coverage-3.2.1.patch


It works by storing the .gcno file in the cache (next to the .o file), when 
using --coverage.

It also needs to hash the absolute path to the .gcda file, that is created 
later at runtime.
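
In outline, the two changes could look like the sketch below. The function names are hypothetical, SHA-256 stands in for ccache's own hash, and deriving the .gcno/.gcda names from the object file name is an assumption about the compiler's naming:

```python
import hashlib
import os


def coverage_hash(common_input, output_obj, coverage):
    """Return a hex digest; with coverage on, mix in the absolute .gcda path."""
    h = hashlib.sha256(common_input)  # stand-in for the regular ccache hash
    if coverage:
        base = os.path.splitext(os.path.abspath(output_obj))[0]
        h.update((base + ".gcda").encode())  # same source, other dir: new key
    return h.hexdigest()


def files_to_cache(output_obj, coverage):
    """List the result files to store for one compilation."""
    files = [output_obj]
    if coverage:
        files.append(os.path.splitext(output_obj)[0] + ".gcno")
    return files
```

With the path in the hash, the same source built in two different directories gets two cache entries when coverage is on, and shares one entry otherwise.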


Please include these in ccache, under GNU General Public License v3. I am their 
author.

/Anders
