Thank you for the very clarifying text below. Some version of it would surely 
be helpful in the gcc manual when introducing the -march and -mcpu/-mtune 
flags. 

Since I need atomic reads/writes of FP variables in a multi-threaded program, 
I have no option other than the slow fld/fst instructions, which are 
still many times faster than locking/unlocking a mutex.
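
For the archives, here is a minimal sketch of what I mean, not a definitive
implementation. The helper names atomic_load_double and atomic_store_double
are hypothetical (they are not from GCC or any library). It assumes GCC-style
inline assembly, an x86 target of Pentium class or later, and a double that
is aligned on an 8-byte boundary; as I read the Intel manuals, an aligned
64-bit access is then performed as a single operation. It makes no promise
about memory ordering, only about not seeing a half-written value.

```c
/* Hypothetical helpers for illustration only. */
#if defined(__GNUC__) && (defined(__i386__) || defined(__x86_64__))

/* Read a double with a single 64-bit x87 load.
   ASSUMPTION: *src is 8-byte aligned. */
static inline double atomic_load_double(const volatile double *src)
{
    double value;
    /* fldl pushes the 64-bit memory operand onto the x87 stack;
       the "=t" constraint tells GCC the result is left in st(0). */
    __asm__ __volatile__("fldl %1" : "=t"(value) : "m"(*src));
    return value;
}

/* Write a double with a single 64-bit x87 store.
   ASSUMPTION: *dst is 8-byte aligned. */
static inline void atomic_store_double(volatile double *dst, double value)
{
    /* fstpl stores st(0) to memory and pops it, so "st" has to be
       listed as clobbered. */
    __asm__ __volatile__("fstpl %0" : "=m"(*dst) : "t"(value) : "st");
}

#else

/* Fallback for other targets: a plain copy with NO atomicity guarantee. */
static inline double atomic_load_double(const volatile double *src)
{
    return *src;
}

static inline void atomic_store_double(volatile double *dst, double value)
{
    *dst = value;
}

#endif
```

A writer thread would call atomic_store_double(&shared, x) and a reader would
call atomic_load_double(&shared). Writing the instructions out by hand like
this avoids depending on which load/store sequence the compiler picks for a
given tuning option.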

Peter

On Saturday 12 February 2005 11:42, Marcel Cox wrote:
> Peter Soetens wrote:
> > I was wondering why the above gcc parameter does not enable the use
> > of the fst/fld opcodes for pentium processors, while -march=i686
> > does. The Intel manuals specifically say that they can be used across
> > all pentium processors.
>
> There are two options for telling the compiler about your target
> processor. The -march=xyz option tells it which instruction set to use,
> while the -mcpu=xyz option tells it for which processor the program
> should run fastest.
> If you supply the -march option, but not the -mcpu option, then the
> compiler will assume the same processor for both.
>
> The differences in the code you see are actually due to the -mcpu
> option. For your first code example, you implicitly use -mcpu=586 and
> for the second example, you use -mcpu=686. So your first code is
> supposed to run fastest on a Pentium class processor while your second
> code is supposed to run fastest on a Pentium2 class processor.
> Now, on a Pentium processor, the FLD and FST instructions are
> (relatively) expensive. So the compiler decides it is faster to do
> load/store operations using integer registers. On Pentium2 class
> processors, the FLD and FST instructions are much faster, and now the
> compiler considers it worthwhile to use them.
>
> Now if you want to generate code that will be guaranteed to run on
> Pentium processors, but runs best on Pentium2 class processors, you
> have to use both the options -march=pentium and -mcpu=pentium2 (you can
> also use 586 and 686 which are aliases, but I would recommend using
> real processor names).
> Of course, as Pentium2 processors are not so common any more either,
> you can also tune your code for Pentium4 using -mcpu=pentium4, or for
> AMD Athlon processors using -mcpu=athlon or some specific athlon model.
>
> Note that on newer versions of GCC (starting with 3.4.0), the -mcpu
> option has been deprecated and replaced by the -mtune option to be
> consistent with other processor architectures supported by GCC.
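
One way to check which instruction set and which tuning target a given set of
flags actually selected is to look at the compiler's predefined macros. The
sketch below is only an illustration: the function names arch_level and
tune_target are hypothetical, and the macro list covers just the processors
mentioned above. It assumes the __i586__/__i686__ and __tune_*__ macros that
GCC predefines on x86; other compilers or GCC versions may define a different
set, in which case the functions fall through to the generic answers.

```c
/* Hypothetical helpers for illustration only. */

/* Instruction set selected by -march, as seen through predefined macros. */
static const char *arch_level(void)
{
#if defined(__x86_64__)
    return "x86-64";
#elif defined(__i686__)
    return "i686 (pentiumpro class)";
#elif defined(__i586__)
    return "i586 (pentium class)";
#elif defined(__i386__)
    return "other 32-bit x86";
#else
    return "not x86";
#endif
}

/* Scheduling target selected by -mcpu/-mtune.
   ASSUMPTION: the compiler defines GCC's __tune_*__ macros. */
static const char *tune_target(void)
{
#if defined(__tune_pentium4__)
    return "pentium4";
#elif defined(__tune_athlon__)
    return "athlon";
#elif defined(__tune_pentium2__)
    return "pentium2";
#elif defined(__tune_i686__)
    return "i686 (pentiumpro class)";
#elif defined(__tune_i586__)
    return "i586 (pentium class)";
#else
    return "unknown or other";
#endif
}
```

Built with -march=pentium -mtune=pentium2 (or -mcpu=pentium2 on older
versions), I would expect the two functions to report different processors.
The full macro list for any flag combination can be dumped with
"gcc -march=pentium -mtune=pentium2 -dM -E - < /dev/null".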

-- 
------------------------------------------------------------------------
Peter Soetens, Research Assistant                  http://www.orocos.org
Katholieke Universiteit Leuven
Division Production Engineering,                      tel. +32 16 322773
Machine Design and Automation                         fax. +32 16 322987
Celestijnenlaan 300B                   [EMAIL PROTECTED]
B-3001 Leuven Belgium                 http://www.mech.kuleuven.ac.be/pma
------------------------------------------------------------------------
