On 26 Nov, Matti Rintala wrote:
>>Looking at the source of gcc (toplev.c):
>> -fexpensive-optimizations is activated with -O2 or higher,
>> -fschedule-insns2 might be also activated with -O2 or higher (I think
>> it is),
>> -O6 didn't exist (3 is the highest value for gcc).
>>
>>You might also try if adding "-fmove-all-movables -freduce-all-givs
>>-fsched-interblock -fbranch-count-reg -fforce-addr" does something valuable.
>>
>
> I made some checks using gcc 3.0.2 (compiled by myself using default
> configuration for Linux) on my Duron. I made tests, and here are results:
[list of options]
> 4) So, -fsched-interblock and -fbranch-count-reg are also automatically
> enabled with -O3 (according to docs also)
Oops, I forgot to mention that the list I have is based on a pre-3.0
snapshot I looked at some time ago.
> 5) I also tried playing with -fmove-all-movables -fsched-spec-load
> -fforce-addr -freduce-all-givs -foptimize-sibling-calls. Their effect
> was very small. It seemed that -fforce-addr actually slowed encoding
> down 1%, while -fmove-all-movables made it faster by the same amount.
> However, I'm very sceptical about such small differences. On my Duron,
> -malign-double made encoding ~1% faster. The differences were so small
Thanks for the report. If you want to run some more tests: have a look
at frequently used structures and try to reorder their members so that
they are already aligned. The -malign-* options put gaps into
structures, and those gaps waste space in the CPU cache. Reordering the
members may minimize the wasted space, and the improved cache usage may
make the code faster.
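
To illustrate the reordering idea, here is a minimal sketch. The two
structures and their member names are made up for this example and are
not taken from any real source; the actual sizes depend on the target
and on whether -malign-double is in effect, so no numbers are claimed
here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical structure: small and large members alternate, so the
 * compiler may have to insert padding before each double. */
struct scattered {
    char   flag;
    double gain;
    char   mode;
    double bias;
};

/* Same members, widest first: the doubles sit back to back and the two
 * chars share whatever tail padding is left. */
struct reordered {
    double gain;
    double bias;
    char   flag;
    char   mode;
};

/* Bytes saved per instance by the reordering (may be 0 on some targets). */
size_t bytes_saved(void)
{
    /* Guard against unsigned wrap-around in the subtraction below. */
    assert(sizeof(struct reordered) <= sizeof(struct scattered));
    return sizeof(struct scattered) - sizeof(struct reordered);
}

/* Print the size of both layouts and where 'gain' ends up in each. */
void print_layout(void)
{
    printf("scattered: %lu bytes, gain at offset %lu\n",
           (unsigned long)sizeof(struct scattered),
           (unsigned long)offsetof(struct scattered, gain));
    printf("reordered: %lu bytes, gain at offset %lu\n",
           (unsigned long)sizeof(struct reordered),
           (unsigned long)offsetof(struct reordered, gain));
    printf("saved per instance: %lu bytes\n",
           (unsigned long)bytes_saved());
}
```

Calling print_layout() from a small main(), once built with
-malign-double and once without, should show how much padding each
member order picks up on your machine.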
> that I ended up trusting gcc's developers and set my options to
> "-march=athlon -mcpu=athlon -malign-functions=4 -malign-double". And
> even there, gcc docs say that -march=athlon implies -mcpu=athlon, but I
> seem to remember there was some discussion about it on this list so I
> left both options.
At least with gcc 2.95.x there's a difference. Compile the attached
source with and without the -mcpu option; its output will show you
whether there's still a difference.
Bye,
Alexander.
--
Loose bits sink chips.
http://www.Leidinger.net Alexander @ Leidinger.net
GPG fingerprint = C518 BC70 E67F 143F BE91 3365 79E2 9C60 B006 3FE7
#include <stdio.h>

/*
 * Print every CPU-related macro that the compiler has predefined for
 * the current -march/-mcpu settings. Build it with and without -mcpu
 * and compare the output.
 */
int main(void)
{
#if defined(__athlon__)
    puts("__athlon__");
#endif
#if defined(__tune_athlon__)
    puts("__tune_athlon__");
#endif
#if defined(__k6__)
    puts("__k6__");
#endif
#if defined(__tune_k6__)
    puts("__tune_k6__");
#endif
#if defined(__pentiumpro__)
    puts("__pentiumpro__");
#endif
#if defined(__tune_pentiumpro__)
    puts("__tune_pentiumpro__");
#endif
#if defined(__i686__)
    puts("__i686__");
#endif
#if defined(__tune_i686__)
    puts("__tune_i686__");
#endif
#if defined(__pentium__)
    puts("__pentium__");
#endif
#if defined(__tune_pentium__)
    puts("__tune_pentium__");
#endif
#if defined(__i586__)
    puts("__i586__");
#endif
#if defined(__tune_i586__)
    puts("__tune_i586__");
#endif
#if defined(__i486__)
    puts("__i486__");
#endif
#if defined(__tune_i486__)
    puts("__tune_i486__");
#endif
#if defined(__i386__)
    puts("__i386__");
#endif
#if defined(__tune_i386__)
    puts("__tune_i386__");
#endif
    return 0;
}