Tuukka Toivonen wrote:
> > I suggest you learn and use the gcc inline asm. The way gcc implements
> > inline asm is so far the best. It allows gcc to optimize everything as
> > well as it can.
>
> Yes, except that I happen to hate AT&T syntax ;)
AT&T syntax has the advantage that it's not Intel specific. Also, it's
what GNU as uses.
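
For anyone who hasn't seen it, here is a minimal sketch of gcc inline asm in AT&T syntax. The function name add_u32 is made up for illustration, and the asm path is only taken on x86 with a GNU-compatible compiler; elsewhere it falls back to plain C.

```c
#include <stdint.h>

/* Hypothetical helper: add two 32-bit values. */
uint32_t add_u32(uint32_t a, uint32_t b)
{
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))
    uint32_t sum;
    /* AT&T operand order: source first, destination last.
     * %0 is the output; the "0" constraint puts a in the same
     * register as %0, and %2 is b in any general register. */
    __asm__("addl %2, %0"
            : "=r"(sum)
            : "0"(a), "r"(b)
            : "cc");            /* the flags register is clobbered */
    return sum;
#else
    /* Portable fallback for non-x86 targets. */
    return a + b;
#endif
}
```

Because the constraints tell gcc which operands are read and written, it picks the registers itself and can still schedule and optimize the surrounding code.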
> My test system: Pentium 120 MHz, 24 MB main memory, 32 MB
> swap, Linux 2.0.34, gcc version 2.7.2. There were no other
> active programs in the background eating CPU time, but the
> hard disk spun a few times, showing that not everything
> fit in the disk cache.
Which could make the results unreliable.
> Considerations:
> - All libc calls used the conventional stack parameter passing
> convention. This could be changed by breaking compatibility.
> - Why doesn't the kernel use register parameters?? It would be
> ideal since it wouldn't break compatibility!
Can gdb deal with register parameters? It would be pretty hard to handle
the case where a register parameter has to be stored in memory
temporarily whilst the register is used for something else. With the
standard calling convention, all parameters are always accessible at
fixed offsets from ebp.
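
To make the register-parameter case concrete, here is a rough sketch using gcc's regparm function attribute, assuming a gcc that supports it. The attribute only applies to 32-bit x86, so the hypothetical REGPARM3 macro expands to nothing elsewhere; sum3 is likewise just an illustrative name.

```c
/* regparm is an i386-only gcc attribute; elsewhere this expands to nothing. */
#if defined(__GNUC__) && defined(__i386__)
#define REGPARM3 __attribute__((regparm(3)))
#else
#define REGPARM3
#endif

/* With regparm(3) on i386, a, b and c arrive in eax, edx and ecx
 * rather than on the stack. If the compiler needs one of those
 * registers for something else, it has to spill the parameter to a
 * location of its own choosing, which a debugger can only find from
 * the debug information. */
REGPARM3 int sum3(int a, int b, int c)
{
    return a + b + c;
}
```

Callers must see the same declaration, attribute included, otherwise they will pass the arguments on the stack while the function reads them from registers.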
> Can we consider the case closed by this test? I don't think so. In
> particular, the fact that case 5 gives so much better performance than
> any other case makes me suspect that a lot more testing (of different
> real-life programs) is needed.
>
> Surprise, surprise: case 2 is faster than case 1!
Did you try an untimed run first, to get the file cached?
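
Something along these lines is what I mean by an untimed run: read the input once to warm the cache, then time the second pass only. This is just a sketch; read_all and timed_read_warm are made-up names, and it assumes a C11 library that provides timespec_get.

```c
#include <stdio.h>
#include <time.h>

/* Read the whole file and discard the data.
 * Returns the number of bytes read, or -1 if the file cannot be opened. */
long read_all(const char *path)
{
    char buf[65536];
    long total = 0;
    size_t n;
    FILE *f = fopen(path, "rb");

    if (f == NULL)
        return -1;
    while ((n = fread(buf, 1, sizeof buf, f)) > 0)
        total += (long)n;
    fclose(f);
    return total;
}

/* Current wall-clock time in seconds (C11 timespec_get). */
static double now_seconds(void)
{
    struct timespec ts;

    timespec_get(&ts, TIME_UTC);
    return (double)ts.tv_sec + (double)ts.tv_nsec / 1e9;
}

/* One untimed pass to pull the file into the cache, then one timed pass.
 * Returns the elapsed seconds of the second pass, or -1.0 on error. */
double timed_read_warm(const char *path)
{
    double start;

    if (read_all(path) < 0)
        return -1.0;            /* warm-up pass failed */
    start = now_seconds();
    if (read_all(path) < 0)
        return -1.0;
    return now_seconds() - start;
}
```

If the file is larger than the memory available for caching, the warm-up pass won't help, and the timings will still include disk reads.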
I notice that case 1 is the slowest of all. If the most frequently
called functions were inlined, I would have expected it to be the
fastest, as inline functions would be faster than a function call,
whatever the calling convention.
Also, are function calls likely to be a significant overhead for
bzip2?
Aside: it might be worth examining the effect of -funroll-loops.
Depending on the size of the resulting code relative to the size of
the instruction cache, unrolling can slow things down.
--
Glynn Clements <[EMAIL PROTECTED]>