Re: [ccache] Why not cache link commands?

2012-09-18 Thread Mike Frysinger
On Tuesday 18 September 2012 17:07:53 Andrew Stubbs wrote:
> On 18/09/12 21:04, Mike Frysinger wrote:
> > On Tuesday 18 September 2012 08:44:29 Andrew Stubbs wrote:
> >> Clearly there are some technical challenges in doing this: we'd have to
> >> hash all the object files and libraries (a la direct mode), but those
> >> problems are surmountable, I think.
> > 
> > or just re-use build-id ...
> 
> Sorry, I'm probably being thick, but what do you mean?

the linker's --build-id and associated .note.gnu.build-id section.  you can't 
hash the entire object because it can change between compiles.  build-id lets 
you say "regardless of the hash of the entire object, we know the content that 
matters is unchanged".

> >> The linker does not use any libraries not listed with "gcc '-###'
> >> whatever".
> > 
> > mmm different gcc flags can implicitly expand into -l### or different crt
> > objects, so you can't cache linking at the compiler driver level w/out
> > > re-implementing much of the guts of gcc, and even then you'd break with
> > moderately patched gcc versions.
> 
> "-###" isn't meant to be a wildcard. That's an actual GCC option. I put
> quotes around it because most shells would interpret the hashes as the
> start of a comment.

hmm, gotcha.  it does seem to include all the necessary info.  whether it's 
easy for a machine to parse across gcc versions is a diff question :).  seems 
to have changed subtly over time between 3.3.6 and 4.7.1.

> >> I'm also aware that it's not that interesting for many incremental
> >> builds, where the final link will always be different, but my use case
> >> is accelerating rebuilds of projects that may have many outputs, most of
> >> which are likely to be unaffected by small code changes. It's also worth
> >> noting that incremental builds are not the target use case for ccache in
> >> general.
> > 
> > gold should already support incremental linking (ala build-id), so i
> > don't think that's already a fixed problem

err, typo here.  s/don't//.

> As I said, the interesting use case is *not* incremental links. The
> interesting use case is accelerating "clean" builds. ccache can never
> help where genuinely new inputs are involved.

right, i was just agreeing with you and providing more details as to how it 
already works today.
-mike


___
ccache mailing list
ccache@lists.samba.org
https://lists.samba.org/mailman/listinfo/ccache


Re: [ccache] Why not cache link commands?

2012-09-18 Thread Andrew Stubbs

On 18/09/12 21:04, Mike Frysinger wrote:

> On Tuesday 18 September 2012 08:44:29 Andrew Stubbs wrote:
>> Clearly there are some technical challenges in doing this: we'd have to
>> hash all the object files and libraries (a la direct mode), but those
>> problems are surmountable, I think.
>
> or just re-use build-id ...


Sorry, I'm probably being thick, but what do you mean?


>> The linker does not use any libraries not listed with "gcc '-###' whatever".
>
> mmm different gcc flags can implicitly expand into -l### or different crt
> objects, so you can't cache linking at the compiler driver level w/out
> re-implementing much of the guts of gcc, and even then you'd break with
> moderately patched gcc versions.


"-###" isn't meant to be a wildcard. That's an actual GCC option. I put 
quotes around it because most shells would interpret the hashes as the 
start of a comment.


"-###" causes gcc to print the commands that it would run, including the 
link line (well, collect2, but same difference). We can read that and 
bypass reimplementing all of gcc. As you say, without this feature we 
couldn't predict what gcc will do: the compiler wouldn't even need to be 
patched if custom specs files were used.
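
A minimal sketch of reading that output, written in Python rather than ccache's C. It assumes each command in the "-###" output sits on its own line that starts with a space and uses shell-style quoting, and that the link step's program is named collect2 or ld. The exact format has varied between GCC versions, so treat the parsing as illustrative only. All function names are hypothetical, and the sample text is hand-written, not captured from a real compiler.

```python
import shlex
import subprocess


def dry_run(compiler_argv):
    """Run the compiler driver with -### and return what it prints on stderr."""
    proc = subprocess.run(list(compiler_argv) + ["-###"],
                          capture_output=True, text=True)
    return proc.stderr


def find_link_command(dry_run_stderr):
    """Return the argv of the link step, or None if no link step is listed."""
    for line in dry_run_stderr.splitlines():
        # Assumption: only command lines begin with a space.
        if not line.startswith(" "):
            continue
        argv = shlex.split(line)
        if not argv:
            continue
        program = argv[0].rsplit("/", 1)[-1]
        if program in ("collect2", "ld"):
            return argv
    return None


def link_inputs(argv):
    """Pick out objects, archives, shared libraries and -l/-L options."""
    inputs = []
    skip_next = False
    for arg in argv[1:]:
        if skip_next:
            skip_next = False
            continue
        if arg in ("-o", "-plugin"):
            # The following argument is an output or plugin path, not an input.
            skip_next = True
            continue
        if arg.startswith(("-l", "-L")) or arg.endswith((".o", ".a", ".so")):
            inputs.append(arg)
    return inputs


# Hand-written stand-in for "-###" output (not real GCC output).
sample = (
    "Using built-in specs.\n"
    ' "/usr/libexec/gcc/collect2" "-o" "prog" "/usr/lib/crt1.o"'
    ' "-L/usr/lib" "main.o" "-lc"\n'
)
link_argv = find_link_command(sample)
print(link_inputs(link_argv))
```

Everything returned by link_inputs() would then have to be resolved to real files (for example, -lc searched along the -L paths) before it could be hashed; that step is left out here.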



>> I'm also aware that it's not that interesting for many incremental
>> builds, where the final link will always be different, but my use case
>> is accelerating rebuilds of projects that may have many outputs, most of
>> which are likely to be unaffected by small code changes. It's also worth
>> noting that incremental builds are not the target use case for ccache in
>> general.
>
> gold should already support incremental linking (ala build-id), so i don't
> think that's already a fixed problem


As I said, the interesting use case is *not* incremental links. The 
interesting use case is accelerating "clean" builds. ccache can never 
help where genuinely new inputs are involved.


Andrew



Re: [ccache] Why not cache link commands?

2012-09-18 Thread Mike Frysinger
On Tuesday 18 September 2012 08:44:29 Andrew Stubbs wrote:
> Clearly there are some technical challenges in doing this: we'd have to
> hash all the object files and libraries (a la direct mode), but those
> problems are surmountable, I think.

or just re-use build-id ...

> The linker does not use any libraries not listed with "gcc '-###' whatever".

mmm different gcc flags can implicitly expand into -l### or different crt 
objects, so you can't cache linking at the compiler driver level w/out re-
implementing much of the guts of gcc, and even then you'd break with 
moderately patched gcc versions.

> I'm also aware that it's not that interesting for many incremental
> builds, where the final link will always be different, but my use case
> is accelerating rebuilds of projects that may have many outputs, most of
> which are likely to be unaffected by small code changes. It's also worth
> noting that incremental builds are not the target use case for ccache in
> general.

gold should already support incremental linking (ala build-id), so i don't 
think that's already a fixed problem
-mike




Re: [ccache] Why not cache link commands?

2012-09-18 Thread Andrew Stubbs

On 18/09/12 16:37, Justin Lebar wrote:

> ldcache would hash object files and spit out linked files.  It would
> use an entirely separate cache.  Its handling of command-line options
> would be entirely different.  Its processing of input files would be
> entirely different.  ISTM that very little would be shared.


It takes multiple input files and returns a single output file, plus 
stderr. This much is the same.


An input object file is just as hashable as an input header file, you 
just find them a different way. I think the manifest file would need 
little or no modification.
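
A minimal sketch of what such a key could look like, in Python rather than ccache's C. It assumes the link arguments and the ordered list of input files are already known; the function names are hypothetical and SHA-256 is used only because it is in the standard library, not because ccache uses it.

```python
import hashlib


def hash_file(path, chunk_size=1 << 16):
    """Hash a file's contents in chunks so large objects are not read at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def link_cache_key(link_args, input_paths):
    """Combine the link arguments and every input's content hash into one key."""
    key = hashlib.sha256()
    for arg in link_args:
        key.update(arg.encode() + b"\0")
    key.update(b"\0inputs\0")
    # Input order matters to the linker, so it is deliberately preserved.
    for path in input_paths:
        key.update(path.encode() + b"\0")
        key.update(hash_file(path).encode() + b"\0")
    return key.hexdigest()
```

Usage would be along the lines of link_cache_key(["-o", "prog"], ["main.o", "util.o"]); any change to an argument, to the order of inputs, or to the bytes of an input yields a different key.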


Similarly, the output file is just as cacheable. There's probably no 
need to even use a different suffix in the cache.


I've yet to get into the precise details, but I think the file discovery 
mechanism would need to be abstracted out a little, but that's the 
biggest change.


The command line parsing would need a once over, of course. The biggest 
change there is that it's more normal to list multiple input files on 
the command line, and there's no "language" to determine.



> Since this is targeting a niche use-case and is a large change to
> ccache, I'd be hesitant to take this change upstream, if I were Joel.


Right, as little churn as possible, and no extra overhead in the most 
common cases.


Andrew


[ccache] permit ccache to build with clang

2012-09-18 Thread Eitan Adler
Hi,

I needed the following patch for ccache to build with clang. Without
this I get the following error:

[7905 eitan@radar ~/svn/ccache ]%gmake
 (git)-[master]-
clang -DHAVE_CONFIG_H  -DSYSCONFDIR=/usr/local/etc -I. -I.  -MD -MP
-MF .deps/main.c.d -g -O2 -Wall -W -Werror -c -o main.o main.c
clang: error: argument unused during compilation: '-I .'
clang: error: argument unused during compilation: '-I .'
gmake: *** [main.o] Error 1


commit 106e2aa8c74007c3bbca186464bc602081db094d
Author: Eitan Adler 
Date:   Tue Sep 18 12:19:10 2012 -0400

Permit ccache to build with clang

diff --git a/Makefile.in b/Makefile.in
index b561a3e..125ec89 100644
--- a/Makefile.in
+++ b/Makefile.in
@@ -19,7 +19,7 @@ LIBS = @LIBS@
 RANLIB = @RANLIB@

 all_cflags = $(CFLAGS)
-all_cppflags = @DEFS@ @extra_cppflags@ -DSYSCONFDIR=$(sysconfdir) -I. -I$(srcdir) $(CPPFLAGS)
+all_cppflags = @DEFS@ @extra_cppflags@ -DSYSCONFDIR=$(sysconfdir) $(CPPFLAGS)
 all_ldflags = @extra_ldflags@ $(LDFLAGS)
 all_libs = @extra_libs@ $(LIBS)

diff --git a/snprintf.c b/snprintf.c
index e1b86f2..6f6a233 100644
--- a/snprintf.c
+++ b/snprintf.c
@@ -142,7 +142,7 @@
  *included throughout the project files:
  *
  * #if HAVE_CONFIG_H
- * #include <config.h>
+ * #include "config.h"
  * #endif
  * #if HAVE_STDARG_H
  * #include <stdarg.h>
@@ -165,7 +165,7 @@
  */

 #if HAVE_CONFIG_H
-#include <config.h>
+#include "config.h"
 #endif /* HAVE_CONFIG_H */

 #if TEST_SNPRINTF

-- 
Eitan Adler


Re: [ccache] Why not cache compile failures?

2012-09-18 Thread Andrew Stubbs

On 18/09/12 15:34, Justin Lebar wrote:

>> I'm looking at ways to improve compile speed, and one obvious option is to
>> cache compile failures. I'm thinking of certain non-called-for-link autoconf
>> tests, in particular.
>
> Doesn't autoconf have a cache of its own?


Yes, but only for repeated config tests, and for incremental configures; 
it doesn't help you for clean rebuilds. You can take a copy of the 
cache, and reload it in your rebuild, but that prevents it actually 
checking if the result would change, which ccache would do.



> Anyway, ccache makes running the compiler faster.  In the case of
> giving the compiler a small program to compile to test a feature,
> surely running the compiler takes virtually zero time, and the
> overhead is elsewhere.


OK, what I haven't said is that in the system I'm working with a cache 
miss is quite expensive. I'm using something like distcc (but not 
distcc), for a somewhat unusual setup we have.


As a proportion, a cache miss on a small compile task is more expensive 
than a large compile task. If you watch a build log scroll by you can 
actually see it pause when it gets a cache miss. It doesn't help that 
even in a parallel make the configure script tends to be serial. This is 
why I'm trying to mop up all the small jobs.


Andrew


Re: [ccache] Why not cache link commands?

2012-09-18 Thread Justin Lebar
> What I'm looking for is more concrete
> roadblocks I haven't considered.

You'd basically have to rewrite all of ccache.

ccache hashes header files and spits out object files.

ldcache would hash object files and spit out linked files.  It would
use an entirely separate cache.  Its handling of command-line options
would be entirely different.  Its processing of input files would be
entirely different.  ISTM that very little would be shared.

Since this is targeting a niche use-case and is a large change to
ccache, I'd be hesitant to take this change upstream, if I were Joel.

-Justin

On Tue, Sep 18, 2012 at 11:27 AM, Andrew Stubbs  wrote:
> On 18/09/12 15:31, Justin Lebar wrote:
>>>
>>> So, again, before I waste my time implementing this feature, are there
>>> any
>>> other fundamental gotchas that would prevent it ever working or ever
>>> being
>>> useful?
>>
>>
>> On a large project with many inputs to ld, you'd have to hash a /lot/
>> of object files, increasing the overhead of ccache substantially.  I
>> understand that this isn't your particular use-case, but it's the
>> common one.
>
>
> Yes, that's true, but those are also the most expensive link commands, so
> maybe it's not so bad.
>
> I realise that there's some risk that a cache miss can be expensive, and
> that a cache hit might be only a very little cheaper than the real link, but
> I'm prepared to take that risk. What I'm looking for is more concrete
> roadblocks I haven't considered.
>
> Incidentally, I'm also considering the possibility of caching the hashes and
> using the inode/size/mtime etc. to short-cut that process (perhaps as a
> "sloppiness" option), not only for objects, but also for sources.
>
>
>> If you're on Linux, have you tried the gold linker?
>
>
> Let's limit this discussion to what can be done with ccache, please. I
> assure you, we know about the toolchain options.
>
> Andrew


Re: [ccache] Why not cache link commands?

2012-09-18 Thread Andrew Stubbs

On 18/09/12 15:31, Justin Lebar wrote:

>> So, again, before I waste my time implementing this feature, are there any
>> other fundamental gotchas that would prevent it ever working or ever being
>> useful?
>
> On a large project with many inputs to ld, you'd have to hash a /lot/
> of object files, increasing the overhead of ccache substantially.  I
> understand that this isn't your particular use-case, but it's the
> common one.


Yes, that's true, but those are also the most expensive link commands, 
so maybe it's not so bad.


I realise that there's some risk that a cache miss can be expensive, and 
that a cache hit might be only a very little cheaper than the real link, 
but I'm prepared to take that risk. What I'm looking for is more 
concrete roadblocks I haven't considered.


Incidentally, I'm also considering the possibility of caching the hashes 
and using the inode/size/mtime etc. to short-cut that process (perhaps 
as a "sloppiness" option), not only for objects, but also for sources.
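
A minimal sketch of that short-cut, in Python rather than ccache's C. It assumes a file is treated as unchanged when its inode, size and mtime all match the values recorded at the last hash, which is exactly the "sloppy" trade-off: an edit that preserves all three would go unnoticed. The class name is hypothetical, the table lives only in memory (a real version would have to persist it), and SHA-256 stands in for whatever hash ccache uses.

```python
import hashlib
import os


class StatHashCache:
    """Remember content hashes keyed by path and (inode, size, mtime)."""

    def __init__(self):
        self._entries = {}  # path -> (fingerprint, digest)
        self.rehashed = 0   # how many times file contents were actually read

    def digest(self, path):
        st = os.stat(path)
        fingerprint = (st.st_ino, st.st_size, st.st_mtime_ns)
        cached = self._entries.get(path)
        if cached is not None and cached[0] == fingerprint:
            # Metadata unchanged: trust the stored hash and skip reading the file.
            return cached[1]
        with open(path, "rb") as f:
            digest = hashlib.sha256(f.read()).hexdigest()
        self.rehashed += 1
        self._entries[path] = (fingerprint, digest)
        return digest
```

Calling digest() twice on an untouched file reads it only once; the second call costs a single stat().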



> If you're on Linux, have you tried the gold linker?


Let's limit this discussion to what can be done with ccache, please. I 
assure you, we know about the toolchain options.


Andrew


Re: [ccache] Why not cache compile failures?

2012-09-18 Thread Justin Lebar
> I'm looking at ways to improve compile speed, and one obvious option is to
> cache compile failures. I'm thinking of certain non-called-for-link autoconf
> tests, in particular.

Doesn't autoconf have a cache of its own?

Anyway, ccache makes running the compiler faster.  In the case of
giving the compiler a small program to compile to test a feature,
surely running the compiler takes virtually zero time, and the
overhead is elsewhere.

-Justin


Re: [ccache] Why not cache link commands?

2012-09-18 Thread Justin Lebar
> So, again, before I waste my time implementing this feature, are there any
> other fundamental gotchas that would prevent it ever working or ever being
> useful?

On a large project with many inputs to ld, you'd have to hash a /lot/
of object files, increasing the overhead of ccache substantially.  I
understand that this isn't your particular use-case, but it's the
common one.

If you're on Linux, have you tried the gold linker?

-Justin

On Tue, Sep 18, 2012 at 8:44 AM, Andrew Stubbs  wrote:
> Hi all, again,
>
> I've just posted about improving compile speed by caching compiler failures,
> and in the same vein I'd like to consider caching called-for-link compile
> tasks.
>
> This is partly interesting for the many small autoconf tests, but is also
> increasingly interesting for real compilations, now that
> whole-program-optimization and link-time-optimization are more available in
> GCC. Even without all this link-time compilation activity, there are some
> link operations that simply take forever, mostly due to large file sizes.
>
> Clearly there are some technical challenges in doing this: we'd have to hash
> all the object files and libraries (a la direct mode), but those problems
> are surmountable, I think. The linker does not use any libraries not listed
> with "gcc '-###' whatever".
>
> I'm also aware that it's not that interesting for many incremental builds,
> where the final link will always be different, but my use case is
> accelerating rebuilds of projects that may have many outputs, most of which
> are likely to be unaffected by small code changes. It's also worth noting
> that incremental builds are not the target use case for ccache in general.
>
> So, again, before I waste my time implementing this feature, are there any
> other fundamental gotchas that would prevent it ever working or ever being
> useful?
>
> Has anybody else ever tried to do this? Is anybody trying to do it now?
>
> Thanks
>
> Andrew


[ccache] Why not cache link commands?

2012-09-18 Thread Andrew Stubbs

Hi all, again,

I've just posted about improving compile speed by caching compiler 
failures, and in the same vein I'd like to consider caching 
called-for-link compile tasks.


This is partly interesting for the many small autoconf tests, but is 
also increasingly interesting for real compilations, now that 
whole-program-optimization and link-time-optimization are more available 
in GCC. Even without all this link-time compilation activity, there are 
some link operations that simply take forever, mostly due to large file 
sizes.


Clearly there are some technical challenges in doing this: we'd have to 
hash all the object files and libraries (a la direct mode), but those 
problems are surmountable, I think. The linker does not use any 
libraries not listed with "gcc '-###' whatever".


I'm also aware that it's not that interesting for many incremental 
builds, where the final link will always be different, but my use case 
is accelerating rebuilds of projects that may have many outputs, most of 
which are likely to be unaffected by small code changes. It's also worth 
noting that incremental builds are not the target use case for ccache in 
general.


So, again, before I waste my time implementing this feature, are there 
any other fundamental gotchas that would prevent it ever working or ever 
being useful?


Has anybody else ever tried to do this? Is anybody trying to do it now?

Thanks

Andrew


[ccache] Why not cache compile failures?

2012-09-18 Thread Andrew Stubbs

Hi all,

I'm looking at ways to improve compile speed, and one obvious option is 
to cache compile failures. I'm thinking of certain non-called-for-link 
autoconf tests, in particular.


I'm aware that there's some danger here that we can end up caching 
Ctrl-C interrupts, SIGTERM/SIGKILL terminations, out-of-memory failures, 
and all manner of other non-reproducible failures, but these are the 
unusual case, and nothing that can't be fixed with CCACHE_RECACHE. I 
might suggest emitting an extra warning message that informs the user 
that they are seeing a cached failure.
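
A minimal sketch of the idea, in Python rather than ccache's C. It assumes the cache key has already been computed, keeps results in an in-memory dict as a stand-in for the real cache, and goes one step beyond the proposal above by refusing to store runs that were killed by a signal (a negative return code from subprocess on POSIX). That check would not catch every non-reproducible failure, such as an out-of-memory error reported as an ordinary exit status. The function name and the wording of the notice are hypothetical.

```python
import subprocess
import sys

_results = {}  # cache key -> (returncode, stdout, stderr)

CACHED_FAILURE_NOTE = "note: this is a cached compile failure\n"


def run_cached(key, argv):
    """Run argv, caching failures as well as successes.

    Returns (returncode, stdout, stderr, was_cache_hit).
    """
    if key in _results:
        code, out, err = _results[key]
        if code != 0:
            # Tell the user the failure was replayed, not freshly produced.
            err += CACHED_FAILURE_NOTE
        return code, out, err, True

    proc = subprocess.run(argv, capture_output=True, text=True)
    # A negative return code means the process died from a signal
    # (Ctrl-C, SIGTERM, SIGKILL): do not remember that result.
    if proc.returncode >= 0:
        _results[key] = (proc.returncode, proc.stdout, proc.stderr)
    return proc.returncode, proc.stdout, proc.stderr, False
```

The second call with the same key returns the stored exit status and stderr without running the command again, with the notice appended so a replayed failure is distinguishable from a fresh one.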


Before I waste time trying to implement this, are there any other 
reasons for not doing this?


Has anybody else tried to do it already?

Thanks

Andrew