-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Duncan wrote:
> 
> I was somewhat aware of that, but hadn't considered the effect on loops,
> and don't understand it enough to be able to explain it as you did, nor enough
> to grok why if it's so much more efficient, gcc doesn't do it by default
> at least on archs sufficiently specified to know the instructions are
> there and that it makes sense. 

From my reading, -ftree-vectorize attempts to use SIMD instructions
wherever possible for parallel computation over arrays and the like.  In
theory that should almost always be a net benefit with few drawbacks.
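
To make that concrete, here is a minimal sketch of the kind of loop the
vectorizer is aimed at.  The function name add_arrays is made up for
illustration, and whether a given gcc version actually vectorizes it
depends on the target and flags, so treat this as an assumption rather
than a guarantee.

```c
#include <stddef.h>

/* Hypothetical example: element-wise addition of two float arrays.
 * `restrict` (C99) promises the compiler that the arrays do not overlap,
 * which removes one common obstacle to vectorization. */
void add_arrays(float *restrict dst, const float *restrict a,
                const float *restrict b, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        /* Each iteration is independent of the others, so several
         * elements can in principle be processed by one SIMD instruction. */
        dst[i] = a[i] + b[i];
    }
}
```

Something like "gcc -std=c99 -O2 -ftree-vectorize -c add_arrays.c" (the
file name is hypothetical) would be one way to try it.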

The problem, as I understand it, is that it is sometimes a bit buggy, i.e.
it occasionally generates broken code.  I think these issues have mostly
been fixed, but that would explain why it is not enabled by default.

There is also a -ftree-vectorizer-verbose=N option which generates
informational messages about why particular loops were or were not
vectorized.  In theory this can help you write code that is easier for
the compiler to optimize.
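
As a rough illustration of what such a report distinguishes, the sketch
below pairs a loop with independent iterations against one with a
loop-carried dependency.  Both function names are invented for this
example, and I am assuming the option is spelled
-ftree-vectorizer-verbose as in the GCC 4.x manual; later GCC releases
moved this reporting to -fopt-info-vec and -fopt-info-vec-missed.

```c
#include <stddef.h>

/* Independent iterations: a plausible vectorization candidate. */
void scale(float *restrict out, const float *restrict in, float k, size_t n)
{
    for (size_t i = 0; i < n; i++)
        out[i] = in[i] * k;
}

/* Loop-carried dependency: each element needs the result of the
 * previous iteration, so a straightforward vectorizer would be
 * expected to report this loop as not vectorized. */
void prefix_sum(float *x, size_t n)
{
    for (size_t i = 1; i < n; i++)
        x[i] += x[i - 1];
}
```

Compiling both with the vectorizer and its report enabled should show
which of the two loops was transformed and give a reason for the other.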

MMX/SSE/etc. can tremendously improve the speed of loop-heavy code.  Any
time you can do 4 operations per cycle instead of 1, you're going to
improve throughput.
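
The "4 vs 1" idea can be written out by hand with the GCC vector
extension (also accepted by Clang).  This is only a sketch: the type name
v4sf and the function add4 are my own choices, and the compiler decides
which instructions actually get emitted for the target.

```c
/* GCC vector extension: four floats packed into one 16-byte value. */
typedef float v4sf __attribute__((vector_size(16)));

/* A single vector addition stands in for four scalar additions.
 * On a target with 128-bit SIMD registers this can map to one
 * packed-add instruction. */
v4sf add4(v4sf a, v4sf b)
{
    return a + b;
}
```

The auto-vectorizer is essentially trying to derive code of this shape
from an ordinary scalar loop.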

Otherwise, I agree with you as far as reducing memory footprint goes -
I've been running -Os for ages and I'm very happy with this.  My RAM is
better applied to disk caching than storing unrolled loops in almost all
cases.  I'm sure in niche cases the opposite is true, but the same
applies to -ffast-math and other dangerous optimizations.  They should
probably be applied on a per-file basis by the developer, and not across
an entire build/system.
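
One reason I would call -ffast-math dangerous is that it lets the
compiler rearrange floating-point arithmetic as if it were associative,
which it is not.  The sketch below (function names made up for
illustration) shows two orderings of the same sum that give different
answers under ordinary IEEE double arithmetic.

```c
/* Sum grouped to the left: (a + b) + c. */
double sum_left(double a, double b, double c)
{
    return (a + b) + c;
}

/* Sum grouped to the right: a + (b + c). */
double sum_right(double a, double b, double c)
{
    return a + (b + c);
}
```

With a = 1e30, b = -1e30 and c = 1.0, the first grouping yields 1.0 and
the second yields 0.0, because 1.0 is lost when added to -1e30.  Code
that depends on one particular grouping can change behaviour once the
compiler is allowed to pick the other.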

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.5 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFFCzL2G4/rWKZmVWkRAkzOAKC1jKik3Q+JdWvpH3qkmMfvWZ823gCeP5km
odH1v8qKb4xrDL5YPLeC62o=
=NnSA
-----END PGP SIGNATURE-----
