Re: Not as much ccache win as I expected

2008-06-15 Thread Jörn Engel
On Fri, 13 June 2008 14:10:29 -0700, Tim Bird wrote:
> 
> Maybe I should just be grateful for any ccache hits I get.

ccache's usefulness depends on your workload.  If you make a change to
include/linux/fs.h, close to 100% of the kernel is rebuilt, with or
without ccache.  But when you revert that change, the build time differs
dramatically.  Without ccache, fs.h was simply changed again and
everything is rebuilt.  With ccache, there are hits for the old version
and all is pulled from the cache - provided you have allotted enough
disk for it.
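
The revert case can be illustrated with a toy content-addressed cache.
This is only a sketch: `toy_cache_lookup` is a made-up helper, and it keys
on a checksum of the raw file, whereas real ccache hashes the preprocessed
source together with the compiler options and compiler identity.

```shell
#!/bin/sh
# Toy model of a compiler cache: report "hit" or "miss" for a source file.
# Usage: toy_cache_lookup SOURCE_FILE CACHE_DIR
toy_cache_lookup() {
    src=$1
    cache_dir=$2
    mkdir -p "$cache_dir" || return 1
    # "cksum < file" prints "<crc> <size>"; join both into one key.
    key=$(cksum < "$src" | tr ' ' '-')
    if [ -e "$cache_dir/$key" ]; then
        echo hit
    else
        # Stand-in for storing the compiled object under this key.
        : > "$cache_dir/$key"
        echo miss
    fi
}
```

Looking up version A, then version B, then version A again should give
miss, miss, hit: the third lookup is the "revert" that a timestamp-based
rebuild cannot shortcut.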

If you never revert to an old version or do some equivalent operation,
ccache can even be a net loss.  On a fast machine, the additional disk
accesses are easily more expensive than the minimal cpu gains.

Jörn

-- 
Public Domain  - Free as in Beer
General Public - Free as in Speech
BSD License- Free as in Enterprise
Shared Source  - Free as in "Work will make you..."
--
To unsubscribe from this list: send the line "unsubscribe linux-embedded" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: about size optimizations (Re: Not as much ccache win as I expected)

2008-06-15 Thread Oleg Verych
> You can do this without changing the Makefile, if you provide suitable
> scripts on $PATH for the make.

I want to add here the whole issue of kbuild's way of calculating
dependencies and its rebuild technique.

1) This whole infrastructure is needed only by developers. But a
developer writing or updating some code must know what has changed
and how it impacts all dependent/relevant code. Thus, one must create a
list of all affected files *before* doing edit/build/run cycles (even
with git/quilt aid). This list must then be fed to the build system to
make sure everything needed is rebuilt, and nothing else is (to save time).

This is a matter of organizing tools and ways of doing things -- a very
important part of doing anything effectively.

2) OTOH a user needs no such thing at all. New kernel -- new build from
scratch. Distros are the same. Also, blind belief in a correct rebuild
from an old object pool is naive.

3) Testers applying and testing patches. OK, now it's a rule to have a
diffstat, thus a list of changed files. But one can easily filter them
out of the diff/patch with `sed`. This can be done even while rejecting
pure whitespace/comment changes.

Now you have a list of files; feed them to the build system, as in (1).
No `make` (recursive or not, or whatever) is needed (use a ccache-like
thing in the general case to save build time). Its key thing --
timestamps -- is a lock on development, somehow overcome by the
`make`-based kbuild 2.6. What an irony.
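
A minimal sketch of the `sed` filtering step from (3), assuming a unified
diff with the usual one-component path prefix (`a/`, `b/`).
`changed_files` is a hypothetical helper name, and the whitespace/comment
filtering mentioned above is not attempted here.

```shell
#!/bin/sh
# Print the files touched by a unified diff, one per line, sorted.
# Reads the patch from the given files, or from stdin if none are given.
# Heuristic only: an added content line that itself starts with "++ "
# would be mistaken for a file header.
changed_files() {
    # "+++ b/kernel/acct.c<TAB>date" -> "kernel/acct.c"
    sed -n 's|^+++ [^/]*/\([^[:space:]]*\).*|\1|p' "$@" |
        grep -v '^dev/null$' |   # deleted files show up as /dev/null
        sort -u
}
```

Something like `changed_files fix.patch` would then produce the list to
feed to the build step.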

Problems:

* more flexible source-usage (thus dependency) tracking is needed
(per-variable, per-function, per-file). This must not be random
comments near #include; it must be a natural part of the source files
themselves. Filenames are not subject to frequent changes. Big files
can be split, but the main prefix must stay the same, so there is no
need to change it in all users. A small "ENOENT || prefix*" heuristic is
quite OK here.

* implemented features and their options must be described and
documented in-place in the sources (distributed configuration). Licence
blocks are not needed; one has a top-level file with it, or
MODULE_LICENSE(). Describe your source in a form that will be easily
parseable for creating dependency and configuration items/options.

* once all this is in place, creating specific config sets by end users
must not be as painful for both sides as it now is.

#include's && #ifdef's are a proven PITA; flexible text processing
(analysis, transformations) with basic tools like `sed` (or `perl`) is
the right way IMHO. At this stage no `gcc -E` is needed for a working
`cat $all >linux.c`.

(Another stone of mine thrown at "The art of thinking in `make` and C".
I hope it's constructive. Again, I see all this as handled by a very
small set of universal scripts.)
-- 
sed 'sed && sh + olecom = love'  <<  ''
-o--=O`C
 #oo'L O
<___=E M


Re: about size optimizations (Re: Not as much ccache win as I expected)

2008-06-15 Thread Jamie Lokier
David Woodhouse wrote:
> On Sat, 2008-06-14 at 10:56 +0100, Oleg Verych wrote:
> > I saw that. My point is pure text processing. But as it seems doing
> > `make` is a lot more fun than to do `sh` && `sed`.
> 
> The problem is that it _isn't_ pure text processing. There's more to
> building with --combine than that, and we really do want the compiler to
> do it.
> 
> _Sometimes_ you can just append C files together and they happen to
> work. But not always. A simple case where it fails would be when you
> have a static variable with the same name in two different files.

I suspect the simplest way to adapt an existing makefile is:

1. Replace each compile command "gcc args... file.c -o file.o"
   with "gcc -E args... file.c -o file.o.i".

2. Replace each incremental link "ld -r -o foo.o files..." with
   "cat `echo files... | sed 's/$/.i/'` > foo.o.i".

3. Similar replacement for each "ar" command making .a files.

4. Replace the main link "ld -o vmlinux files..." with
   "gcc -o vmlinux --combine -fwhole-program `echo files... | sed 's/$/.i/'`".

You can do this without changing the Makefile, if you provide suitable
scripts on $PATH for the make.
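
A small sketch of the name mapping used in steps 2 and 4. Taken
literally, `echo files... | sed 's/$/.i/'` hands sed a single line, so
with several files only the last name would get the suffix. The variant
below prints one name per line first; `to_preprocessed` is a hypothetical
helper name, not something from an existing script.

```shell
#!/bin/sh
# Append ".i" to every file name given as an argument, one name per line.
to_preprocessed() {
    [ "$#" -gt 0 ] || return 0          # nothing to map
    printf '%s\n' "$@" | sed 's/$/.i/'  # one line per name, so $ matches each
}
```

In a wrapper for the incremental link this could be used as
`cat $(to_preprocessed a.o b.o) > foo.o.i`, as long as the file names
contain no whitespace.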

-- Jamie


Re: about size optimizations (Re: Not as much ccache win as I expected)

2008-06-14 Thread Oleg Verych
> _Sometimes_ you can just append C files together and they happen to
> work. But not always. A simple case where it fails would be when you
> have a static variable with the same name in two different files.

AFAIK global static variables are not much appreciated in the kernel. In any
case, a file-scope prefix can easily be added to the name by s/// in the text
compile stage.

There are many more problems with conditional includes and other source
configuration crutches.

> The compiler will do the right thing there, while naïve concatenation
> of C files will not.

That was an example; of course there must be a pre-cc text processing stage.

> Of course, it's _possible_ to have external text processing cope with
> this case somehow -- you'd probably feed it through the preprocessor,
> then look at the output of the preprocessor and make the variable names
> unique, perhaps?

Even before cpp. But twisted includes/ifdef's cannot be handled without it.

> And then move on to the next case which is already handled in gcc...

To gain size reduction, some register-wide static variables (ints), which are
usually used for state handling, can be glued together if the whole picture
permits: if all flags fit in a limited bit range, the needed shift is added
textually.

Again, what is needed is a developer doing text-based transformations with
clear, documented semantics. Can that be done by GCC's optimizing stages?
(However, Rusty may try to do that with cpp :)

> But really, I'd rather just leave it to the compiler. And it's not
> because I have some masochistic fascination with makefiles :)
-- 
sed 'sed && sh + olecom = love' << ''
-o--=O`C
 #oo'L O
<___=E M


Re: about size optimizations (Re: Not as much ccache win as I expected)

2008-06-14 Thread David Woodhouse
On Sat, 2008-06-14 at 10:56 +0100, Oleg Verych wrote:
> I saw that. My point is pure text processing. But as it seems doing
> `make` is a lot more fun than to do `sh` && `sed`.

The problem is that it _isn't_ pure text processing. There's more to
building with --combine than that, and we really do want the compiler to
do it.

_Sometimes_ you can just append C files together and they happen to
work. But not always. A simple case where it fails would be when you
have a static variable with the same name in two different files.

The compiler will do the right thing there, while naïve concatenation
of C files will not.
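
The clash can at least be spotted with plain text tools before any
concatenation is attempted. The sketch below is a crude heuristic, not a
C parser: it only recognises lines of the form `static <type> <name> = ...;`
or `static <type> <name>;`, and `static_clashes` is a hypothetical helper
name.

```shell
#!/bin/sh
# List file-scope static variable names defined in more than one of the
# given C files. Multi-word types, pointers and arrays are not handled.
static_clashes() {
    for f in "$@"; do
        # Extract <name> from "static <type> <name> =" or "static <type> <name>;",
        # and report each name at most once per file.
        sed -n 's/^static[[:space:]][[:space:]]*[A-Za-z_][A-Za-z_0-9]*[[:space:]][[:space:]]*\([A-Za-z_][A-Za-z_0-9]*\)[[:space:]]*[=;].*/\1/p' "$f" | sort -u
    done | sort | uniq -d   # names that occur in two or more files
}
```

If a.c contains `static int counter = 1;` and b.c contains
`static int counter = 2;`, then `cat a.c b.c` would redefine `counter`,
and `static_clashes a.c b.c` would report that name.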

Of course, it's _possible_ to have external text processing cope with
this case somehow -- you'd probably feed it through the preprocessor,
then look at the output of the preprocessor and make the variable names
unique, perhaps? And then move on to the next case which is already
handled in gcc...

But really, I'd rather just leave it to the compiler. And it's not
because I have some masochistic fascination with makefiles :)

-- 
dwmw2



Re: about size optimizations (Re: Not as much ccache win as I expected)

2008-06-14 Thread Oleg Verych
David Woodhouse:
> On Fri, 2008-06-13 at 22:52 +0100, Oleg Verych wrote:
>> Using same `gcc -E` principle, I once had a dream to create
>> build with something like "whole-kernel-at-time" optimising
>> compiler option:
>
> Doing it for the whole kernel probably doesn't buy you a whole lot more
> than doing it a bit more selectively, which is what I was doing in
> http://lkml.org/lkml/2006/8/24/212

I saw that. My point is pure text processing. But as it seems doing
`make` is a lot more fun than to do `sh` && `sed`. The latter, however, is a
basic tool (`make` is a system), which requires knowledge, experience and
commitment. The original `compilercache` (whose ideas ccache reimplemented
in C) was a simple shell script, and it beats `make` very hard in the
general case. Even the flex-based C semi-parser there is a simple task for `sed`.

Understanding non-trivial `sed` scripts is also an issue, as is
maintaining them. But if creating and maintaining, let's say,
"source profiles" is off-loaded to users, in the realm kernel developers
call the "wild", then things can be much easier.

If there were more than one kbuild developer, or a wide and skilled
community, then the same ccache scheme with some kconfig fixes, in the form
of a simple shell script, could be available for kernel builds, including
external modules. In fact, this is how an up-to-date kbuild+kconfig could be
used and developed easily by other projects (klibc, busybox, ...).

It's nice to see that the big Makefile is now going to be split, however.

> I think Segher has been playing with it a bit recently, and confirms my
> suspicion that combining kernel/ with arch/$ARCH/kernel, and mm/ with
> arch/$ARCH/mm, is also a big win.

C with dumb #includes and #ifdef's is very, very obsolete technology.
Much more flexibility can be achieved with text processing, if
size-optimizing source annotation/transformation schemes, based on
human-developer knowledge and users' source profiles, can be used.

> The GCC problems should mostly be fixed now, I think -- we just need to
> have another go at doing the Kbuild side of it properly.

One doesn't need to beg GCC developers for every feature or bug fix that
kernel developers can actually use. Now almost nothing can be done without
compiler support.

One example: returning values && error codes using CPU/GPIO flags, thus
reducing size and CPU load.
-- 
sed 'sed && sh + olecom = love'  <<  ''
-o--=O`C
 #oo'L O
<___=E M


Re: about size optimizations (Re: Not as much ccache win as I expected)

2008-06-14 Thread Adrian Bunk
On Sat, Jun 14, 2008 at 08:43:23AM +0100, David Woodhouse wrote:
> On Fri, 2008-06-13 at 22:52 +0100, Oleg Verych wrote:
> > Using same `gcc -E` principle, I once had a dream to create
> > build with something like "whole-kernel-at-time" optimising
> > compiler option:
> 
> Doing it for the whole kernel probably doesn't buy you a whole lot more
> than doing it a bit more selectively, which is what I was doing in
> http://lkml.org/lkml/2006/8/24/212

For the interesting CONFIG_MODULES=n case it most likely can give you 
much smaller code.

But Denys had section garbage collection patches, and combining the 
per-module compilation with section garbage collection might get near
the results of compiling the whole kernel at once?

> I think Segher has been playing with it a bit recently, and confirms my
> suspicion that combining kernel/ with arch/$ARCH/kernel, and mm/ with
> arch/$ARCH/mm, is also a big win.
>...

The big problem I see here is that we lose the link order.

Not unfixable, but quite nasty to sort out.

> dwmw2

cu
Adrian

-- 

   "Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
   "Only a promise," Lao Er said.
   Pearl S. Buck - Dragon Seed



Re: about size optimizations (Re: Not as much ccache win as I expected)

2008-06-14 Thread David Woodhouse
On Fri, 2008-06-13 at 22:52 +0100, Oleg Verych wrote:
> Using same `gcc -E` principle, I once had a dream to create
> build with something like "whole-kernel-at-time" optimising
> compiler option:

Doing it for the whole kernel probably doesn't buy you a whole lot more
than doing it a bit more selectively, which is what I was doing in
http://lkml.org/lkml/2006/8/24/212

I think Segher has been playing with it a bit recently, and confirms my
suspicion that combining kernel/ with arch/$ARCH/kernel, and mm/ with
arch/$ARCH/mm, is also a big win.

The GCC problems should mostly be fixed now, I think -- we just need to
have another go at doing the Kbuild side of it properly.

-- 
dwmw2



about size optimizations (Re: Not as much ccache win as I expected)

2008-06-13 Thread Oleg Verych
>> And what kinds of source/kconfig changes are made for every build?
>
> I start with a baseline config for an embedded board, then
> alter, one at a time, individual config items related to kernel size.
> No source changes are made.

Using same `gcc -E` principle, I once had a dream to create
build with something like "whole-kernel-at-time" optimising
compiler option:

# preprocess every core file and append the result to one big C file
for file in $all_core_files
do gcc $opt -E "$file" >>core_kernel.c
done && gcc $opt -c core_kernel.c -o vmlinux.c.o  # compile it as one unit

# same for some special parts and asm
do_foo
# do final link
do_vmlinux

I had something like that once for `dash` and got a few
percent of size reduction. It would be interesting to
implement and check this in Linux.

Also, I have many things to point-tune/point-remove based on
usage patterns/source patterns, which can easily be removed from
the sources by a stream text editor. These are things like unneeded
* syscalls, sysctls, ioctls
* fields in data structures, and the sources for handling them
* code branches or code blocks which are known (in a particular
board / embedded case) to be useless etc. etc.

But all this requires a non-trivial source text editor or visual tools
for easy analysis, navigation, marking, RE generation and
build + run testing.

With a mid-70s command line and `make` (even the kbuild version),
or 20+ year old text editor technology, it's not that trivial
to accomplish.

Some kind of IDE for kernel development is needed. Here
one will have all whitespace and code style policy, static
checks for stupid security holes, C mis-use, kernel API
mis-use, all those crutches in linux/scripts/* in one place
and applied right away.

Sorry for this rant, but I see no development here at all.
New schedulers, slab allocators, file systems are great.
But all this has nothing to do with fundamental development
of the tools. Hardware is quite fast and has plenty
of RAM/ROM/caches now, yet software just bloats.

Streaming, pre-cc text editing is the key; a fine GCC is not.
-- 
sed 'sed && sh + olecom = love'  <<  ''
-o--=O`C
 #oo'L O
<___=E M


Re: Not as much ccache win as I expected

2008-06-13 Thread Tim Bird
Oleg Verych wrote:
> And what kinds of source/kconfig changes are made for every build?

I start with a baseline config for an embedded board, then
alter, one at a time, individual config items related to kernel size.
No source changes are made.

I do full removal of the kernel source tree and build area
before the start of each test.

> (any versions, e.g. localversion, .version, aren't important, they are for
> modules ko and vmlinux, afaik)
Ok - this is helpful.

> kbuild is a `ccache` in itself. Every *.o.cmd holds the kind of info
> `ccache` hashes (except things like stderr, gcc version) to check for
> repeated rebuilds.
Yeah, I'm pretty impressed with how well kbuild avoids rebuilding
stuff in the first place.

Maybe I should just be grateful for any ccache hits I get.

Thanks,
 -- Tim

=
Tim Bird
Architecture Group Chair, CE Linux Forum
Senior Staff Engineer, Sony Corporation of America
=



Re: Not as much ccache win as I expected

2008-06-13 Thread Oleg Verych
Tim Bird @ Fri, 13 Jun 2008 12:06:05 -0700:

> I'm running an automated test which does numerous compiles
> of the Linux kernel.  One of the things I do is create a localversion
> file at the root of the kernel source tree with a unique identifier
> that I use later on in testing.

And what kinds of source/kconfig changes are made for every build?
(any versions, e.g. localversion, .version, aren't important, they are for
modules ko and vmlinux, afaik)

[...]
> Is there anything else obvious which prevents ccache from
> working well with a kernel build (that is, anything else that
> would, for otherwise identical C files with a similar build,
> cause a difference?)

kbuild is a `ccache` in itself. Every *.o.cmd holds the kind of info
`ccache` hashes (except things like stderr, gcc version) to check for
repeated rebuilds. Also, the kconfig<->kbuild link via header magic may
confuse a general-purpose `ccache`.

For rebuilds of the same codebase, it's better to use separate
kbuild object output directories.
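
One way to keep the object trees apart is kbuild's `O=` output directory
option. A sketch, assuming one directory per config variant;
`build_variant`, `BUILD_ROOT` and the `MAKE` override are illustrative
names, not part of kbuild.

```shell
#!/bin/sh
# Build one config variant into its own kbuild output directory.
# Usage: build_variant NAME [TARGET]
# BUILD_ROOT (default ../build) holds one subdirectory per variant;
# MAKE can be overridden, e.g. MAKE=echo for a dry run.
build_variant() {
    name=$1
    out=${BUILD_ROOT:-../build}/$name
    mkdir -p "$out" || return 1
    # O= tells kbuild to put all generated files under $out.
    ${MAKE:-make} O="$out" "${2:-vmlinux}"
}
```

Each variant's .config would have to be placed in its own output
directory before the build is started.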

> Any tips would be appreciated.

Just my handwaving, but a test toggling a couple of core config symbols
shows only one new `ccache` hit.

[EMAIL PROTECTED]:/mnt/zdev0/blinux$ CCACHE_DIR=_ccache/ ccache -s
cache directory     _ccache/
cache hit               1133
cache miss              1141
called for link           28
not a C/C++ file          64
no input file            282
files in cache          2282
cache size              15.7 Mbytes
max cache size         976.6 Mbytes
[EMAIL PROTECTED]:/mnt/zdev0/blinux$ make menuconfig # toggle
[EMAIL PROTECTED]:/mnt/zdev0/blinux$ diff -u1 .config.old .config
--- .config.old 2008-06-13 22:52:37.0 +0200
+++ .config 2008-06-13 23:01:07.0 +0200
@@ -3,3 +3,3 @@
 # Linux kernel version: 2.6.24
-# Fri Jun 13 22:52:37 2008
+# Fri Jun 13 23:01:07 2008
 #
@@ -67,3 +67,4 @@
 CONFIG_POSIX_MQUEUE=y
-# CONFIG_BSD_PROCESS_ACCT is not set
+CONFIG_BSD_PROCESS_ACCT=y
+CONFIG_BSD_PROCESS_ACCT_V3=y
 # CONFIG_TASKSTATS is not set
[EMAIL PROTECTED]:/mnt/zdev0/blinux$
[EMAIL PROTECTED]:/mnt/zdev0/blinux$ CCACHE_DIR=_ccache/ ccache -s
cache directory     _ccache/
cache hit               1134
cache miss              1879
called for link           33
not a C/C++ file          79
no input file            401
files in cache          3758
cache size              28.1 Mbytes
max cache size         976.6 Mbytes
[EMAIL PROTECTED]:/mnt/zdev0/blinux$
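
The two snapshots can be compared directly: hits went from 1133 to 1134
and misses from 1141 to 1879. A small sketch of that arithmetic follows;
`ccache_delta` is a hypothetical helper, and the four counters are typed
in by hand rather than parsed.

```shell
#!/bin/sh
# Hits, misses and hit rate of the builds done between two "ccache -s" runs.
# Usage: ccache_delta HITS_BEFORE MISSES_BEFORE HITS_AFTER MISSES_AFTER
ccache_delta() {
    awk -v h0="$1" -v m0="$2" -v h1="$3" -v m1="$4" 'BEGIN {
        hits = h1 - h0
        misses = m1 - m0
        total = hits + misses
        rate = (total > 0) ? 100 * hits / total : 0
        printf "hits=%d misses=%d rate=%.1f%%\n", hits, misses, rate
    }'
}
```

For the run above, `ccache_delta 1133 1141 1134 1879` prints
`hits=1 misses=738 rate=0.1%`.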

-- 
sed 'sed && sh + olecom = love'  <<  ''
-o--=O`C
 #oo'L O
<___=E M


Not as much ccache win as I expected

2008-06-13 Thread Tim Bird
I'm running an automated test which does numerous compiles
of the Linux kernel.  One of the things I do is create a localversion
file at the root of the kernel source tree with a unique identifier
that I use later on in testing.

I started using ccache to improve the performance of my builds,
but found that the hit rate on the cache was not very good.
 $ ccache -s
cache directory     /home/tbird/.ccache
cache hit              74416
cache miss             59400
called for link        87252
compile failed            21
not a C/C++ file      143449
no input file          49336
files in cache         42844
cache size               1.8 Gbytes
max cache size           2.0 Gbytes
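
For reference, the hit rate implied by these counters can be computed
from the `ccache -s` text itself. This is a sketch that assumes the plain
"cache hit" / "cache miss" lines shown above (other ccache versions may
label their counters differently); `ccache_hit_rate` is a hypothetical
helper name.

```shell
#!/bin/sh
# Read "ccache -s" output on stdin and print hits / (hits + misses).
ccache_hit_rate() {
    awk '/^cache hit/  { hits = $NF }
         /^cache miss/ { misses = $NF }
         END {
             if (hits + misses > 0)
                 printf "%.1f%%\n", 100 * hits / (hits + misses)
         }'
}
```

With 74416 hits and 59400 misses, `ccache -s | ccache_hit_rate` would
print `55.6%`.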

Thinking that the problem might be having a unique version for
every build (and that this change flowed to every file via the
version.h file), I tried building without this change.  I saw
an improvement, but not much.

Is there anything else obvious which prevents ccache from
working well with a kernel build (that is, anything else that
would, for otherwise identical C files with a similar build,
cause a difference?)

Any tips would be appreciated.
 -- Tim

=
Tim Bird
Architecture Group Chair, CE Linux Forum
Senior Staff Engineer, Sony Corporation of America
=
