On Sun December 4 2005 6:37 am, Kristian Poul Herkild wrote:
> Robert Crawford wrote:
> > On Sun December 4 2005 4:11 am, Kristian Poul Herkild wrote:
> >
> >
> >
> > -mfpmath=sse is not a good idea, the consensus is it actually lowers
> > performance. -msse -mmmx -m3dnow are redundant (implied by
> > -march=athlon-xp), and should be removed from your cflags line, but
> > SHOULD be placed in your USE= line, without the - sign, like this:
> >
> > USE="mmx 3dnow sse"
> >
> > If you use gcc-3.4.4, these flags should work fine (I've used them for a
> > long time- no problems).
> >
> > CFLAGS="-march=athlon-xp -O3 -pipe -fomit-frame-pointer -fweb -ftracer
> > -fprefetch-loop-arrays -ffast-math -falign-functions=64 -fno-ident"
> >
> > CXXFLAGS="${CFLAGS} -fvisibility-inlines-hidden"
>
> Hmm... according to this thread
> http://forums.gentoo.org/viewtopic.php?t=43648 and the GCC manual -march
> does not imply -mmmx -msse -m3dnow, nor does it imply -mfpmath=sse. I know
> of no consensus of -mfpmath=sse lowering performance. Actually, I only
> know of the opposite from the LFS-community as well as Gentoo Wiki.
>
> I don't want to start a flamewar on this, so if you have other and more
> correct information than me, then please share it :)
>
> -Kristian Poul Herkild
No flame war. If my conclusions or understanding are incorrect, I'd love to
know and make corrections!
I think that almost 3-year-old thread refers to -mcpu (now deprecated in
favor of -mtune), not -march=athlon-xp (the actual architecture). -mcpu or
-mtune only tunes the code for, say, athlon-xp, while still restricting it to
instructions available across the whole i386 family. Thus the resulting
binary still runs on different, older cpus.
On the other hand, -march=athlon-xp generates code that is only expected to
work on an athlon-xp cpu, so it should be more "tuned" to that cpu (less
bloat). At least that's the theory: why compile in code for other cpus that
you don't need or use?
My understanding of man gcc is that -march=athlon-xp does enable mmx, 3dnow,
and sse support.
In other words, from a "freshmeat" article:
-------------------------------------------------------------------------------------------------
"-march implies -mcpu, so when you use -march, there's no need to use -mcpu.
-mcpu generates code tuned for the specified CPU, but it does not alter the
ABI and the set of available instructions, so you can still run the resulting
binary on other CPUs.
When you use -march, you generate code for the specified machine type, and
the available instructions will be used, which means that you probably cannot
run the binary on other machine types."
----------------------------------------------------------------------------------------------------
For example, from this thread http://forums.gentoo.org/viewtopic.php?p=275851,
page 3, bottom:
"If I compile with -march=athlon-xp, sse, 3dnow, and mmx are enabled (through
the -D__athlon_sse__ -D__tune_athlon__ -D__tune_athlon_sse__ -D__SSE__
-D__MMX__ -D__3dNOW__ -D__3dNOW_A__ macros). When I add, for example -mmmx,
-mno-mmx appears after -mmmx in the "options enabled" list in the output of
gcc -Q -v -march=athlon-xp -mmmx. However, -D__MMX__ doesn't go away, so MMX
is still used. In short -mmmx, -msse, and -m3dnow are unnecessary, but they
don't hurt."
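The check described in that quote can be repeated by dumping the compiler's
predefined macros and looking for the SIMD ones. A minimal sketch, assuming
gcc is installed; simd_macros is a hypothetical helper name of my own, not
part of gcc or Portage:

```shell
# Hypothetical helper: reduce a GCC predefined-macro dump (read from stdin)
# to the MMX/SSE/3DNow! feature macros named in the quoted forum post.
# Macros such as __SSE2__ or __i386__ are deliberately not matched.
simd_macros() {
    grep -E '^#define __(MMX|SSE|3dNOW|3dNOW_A)__( |$)' | sort
}
```

Feed it a macro dump, for example
gcc -march=athlon-xp -dM -E -x c /dev/null | simd_macros
(a 64-bit toolchain will probably also need -m32). If -march=athlon-xp really
implies the SIMD flags, the __MMX__, __SSE__ and __3dNOW__ lines should show
up without -mmmx, -msse or -m3dnow on the command line.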
Over the years, I've read similar statements by experienced people in
hundreds of posts on many forums and groups. Sorry I can't point you to them
off the top of my head. If you can wade through the huge cflags central
Gentoo forum threads (an ordeal in itself), you will probably reach the same
conclusions I have.
Also, as I understand it from some recent posts, compiling in mmx, 3dnow, and
sse support is pointless bloat in any program that doesn't use it, so putting
them in USE= makes much more sense than putting them in CFLAGS.
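Cleaning an existing CFLAGS line along these lines could be scripted roughly
as follows. This is only a sketch: strip_simd_flags is a hypothetical helper
name, and it assumes the flags are separated by plain whitespace.

```shell
# Hypothetical helper: print a CFLAGS string with -mmmx, -msse and -m3dnow
# removed, keeping every other flag in its original order.
# The body runs in a subshell so 'set -f' and the variables stay local.
strip_simd_flags() (
    set -f                      # no glob expansion while word-splitting $1
    out=""
    for flag in $1; do
        case "$flag" in
            -mmmx|-msse|-m3dnow) ;;          # dropped: redundant per the post
            *) out="${out:+$out }$flag" ;;   # everything else is kept
        esac
    done
    printf '%s\n' "$out"
)
```

For example, CFLAGS="$(strip_simd_flags "$CFLAGS")" leaves -march=athlon-xp
and the optimization flags alone; the features themselves then go into
USE="mmx 3dnow sse" as in the earlier message.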
As for -mfpmath=sse, every benchmark article (and several more recent forum
posts) I've seen indicates no real performance gain and, in many cases,
degraded performance, at least with AMD cpus. That's contrary to what man gcc
generally says, but people who have actually run tests tend to disagree. Keep
in mind that the gcc version and the cpu type (AMD or Intel) also influence
the results. However, if anyone knows of more recent info on this flag,
please post a link.
Robert Crawford