Thank you for the very clarifying text below. Some version of it would surely be helpful in the gcc manual when introducing the -march and -mcpu/-mtune flags.
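To try the flag combination described below on a real compiler, a minimal sketch such as the following may help. It is an illustration only, not code from this thread: the file name copy.c and the function copy_double are hypothetical. It is a plain copy of a double, compiled to assembly under different -march/-mtune settings so the load/store sequences can be compared. The commands use -mtune because -mcpu is deprecated from GCC 3.4.0 on. The instructions actually emitted may differ between GCC versions, so inspect the .s files rather than assuming a result.

```c
/* copy.c: hypothetical test file, not from the original thread.
 *
 * Compile to assembly only and compare the load/store sequences:
 *
 *   gcc -m32 -O2 -DNDEBUG -march=pentium -S copy.c -o copy-p5.s
 *   gcc -m32 -O2 -DNDEBUG -march=pentium -mtune=pentium2 -S copy.c -o copy-p5-p2.s
 *
 * -m32 is only needed on a 64-bit host. With GCC older than 3.4.0,
 * use -mcpu=pentium2 in place of -mtune=pentium2.
 */
#include <assert.h>
#include <stddef.h>

/* Copy one double. volatile keeps the compiler from optimising the
 * memory accesses away; it does NOT by itself make them atomic. */
void copy_double(volatile double *dst, const volatile double *src)
{
    assert(dst != NULL && src != NULL);
    *dst = *src;
}
```

Whether the 8-byte copy appears as one fld/fst pair or as two 32-bit integer moves is the tuning decision explained in the quoted text. Only the single 64-bit access is useful for an atomic read/write, and even then the variable is assumed to be 8-byte aligned.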
Since I needed atomic reads/writes of FP variables in a multi-threaded program, I have no other option than to use the slow fld/fst instructions, which are still many times faster than locking/unlocking a mutex.

Peter

On Saturday 12 February 2005 11:42, Marcel Cox wrote:
> Peter Soetens wrote:
> > I was wondering why the above gcc parameter does not enable the use
> > of the fst/fld opcodes for pentium processors, while -march=i686
> > does. The Intel manuals specifically say that they can be used across
> > all pentium processors.
>
> There are 2 options to tell the compiler about your wanted processor.
> The -march=xyz option tells you the instruction set to use, while the
> -mcpu=xyz option tells you for which processor the program should run
> fastest.
> If you supply the -march option, but not the -mcpu option, then the
> compiler will assume you use the same processor for both.
>
> The difference in the code you see are actually due to the -mcpu
> option. For your first code example, you implicitly use -mcpu=586 and
> for the second example, you use -mcpu=686. So your first code is
> supposed to run fastest on a Pentium class processor while your second
> code is supposed to run fastest on a Pentium2 class processor.
> Now, on a Pentium processor, the FLD and FST instructions are
> (relatively) expensive. So the compiler decides it is faster to do
> load/store operations using integer registers. On Pentium2 class
> processors, the FLD and FST instructions are much faster, and now the
> compiler considers it worthwhile to use them.
>
> Now if you want to generate code that will be guaranteed to run on
> Pentium processors, but runs best on Pentium2 class processors, you
> have to use both the options -march=pentium and -mcpu=pentium2 (you can
> also use 586 and 686 which are aliases, but I would recommend you to
> use real processor names)
> Of course, as Pentium2 processors are not so common any more either,
> you can also tune your code for Pentium4 using -mcpu=pentium4, or for
> AMD Athlon processors using -mcpu=athlon or some specific athlon model.
>
> Note that on newer versions of GCC (starting with 3.4.0), the -mcpu
> option has been deprecated and replaced by the -mtune option to be
> consistent with other processor architectures supported by GCC.

--
------------------------------------------------------------------------
Peter Soetens, Research Assistant                  http://www.orocos.org
Katholieke Universiteit Leuven
Division Production Engineering,                      tel. +32 16 322773
Machine Design and Automation                         fax. +32 16 322987
Celestijnenlaan 300B                                   [EMAIL PROTECTED]
B-3001 Leuven Belgium                 http://www.mech.kuleuven.ac.be/pma
------------------------------------------------------------------------