Richard Freeman <[EMAIL PROTECTED]> posted
[EMAIL PROTECTED], excerpted below, on  Thu, 14 Sep 2006
19:45:28 -0400:

> Duncan wrote: [snip]
> 
> Hmm - no -ftree-vectorize?  Care to comment on that?  I hear that it can
> be buggy with a few packages, but I'm guessing it is worth having in
> there in general.

The gcc manpage is a bit sparse (understatement) on vectorize, but the next
entry has a bit more info.

<quote, reformatted for posting>

  -ftree-vectorize
    Perform loop vectorization on trees.

  -ftree-vect-loop-version
    Perform loop versioning when doing loop vectorization on trees.  When a
    loop appears to be vectorizable except that data alignment or data
    dependence cannot be determined at compile time, then vectorized and
    non-vectorized versions of the loop are generated along with runtime
    checks for alignment or dependence to control which version is
    executed.  This option is enabled by default except at level -Os where
    it is disabled.

</quote>
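
Taking the versioning text at face value, here is my reading of it as
hand-written C.  This is only a sketch under stated assumptions: the names
add_versioned, add_by_fours, add_scalar and no_overlap are ones I made up,
the by-fours loop is an ordinary-C stand-in for whatever gcc would really
emit (assuming "vectorized" means handling several elements per pass), and
I only show the dependence (overlap) check, not the alignment one.

```c
#include <stddef.h>
#include <stdint.h>

/* Plain one-element-at-a-time loop: the "non-vectorized version". */
static void add_scalar(float *dst, const float *a, const float *b, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = a[i] + b[i];
}

/* Stand-in for the "vectorized version" (an assumption, not gcc output):
 * load four elements from each input, add them, then store all four.
 * Because the loads happen before the stores, this only matches
 * add_scalar when dst does not overlap a or b. */
static void add_by_fours(float *dst, const float *a, const float *b, size_t n)
{
    size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        float a0 = a[i], a1 = a[i + 1], a2 = a[i + 2], a3 = a[i + 3];
        float b0 = b[i], b1 = b[i + 1], b2 = b[i + 2], b3 = b[i + 3];
        dst[i]     = a0 + b0;
        dst[i + 1] = a1 + b1;
        dst[i + 2] = a2 + b2;
        dst[i + 3] = a3 + b3;
    }
    for (; i < n; i++)          /* leftover elements, one at a time */
        dst[i] = a[i] + b[i];
}

/* Runtime dependence check: 1 if the n-float ranges at p and q are disjoint. */
static int no_overlap(const float *p, const float *q, size_t n)
{
    uintptr_t x = (uintptr_t)p, y = (uintptr_t)q;
    size_t bytes = n * sizeof(float);
    return x + bytes <= y || y + bytes <= x;
}

/* The "versioned" loop: both copies exist in the binary, and a runtime
 * check picks one.  Returns 1 if the by-fours copy ran, 0 otherwise. */
int add_versioned(float *dst, const float *a, const float *b, size_t n)
{
    if (no_overlap(dst, a, n) && no_overlap(dst, b, n)) {
        add_by_fours(dst, a, b, n);
        return 1;
    }
    add_scalar(dst, a, b, n);
    return 0;
}
```

If that reading is right, every versioned loop is carried twice in the
binary plus the check, which would square with the option being disabled
at -Os.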

I'm unclear as to what "vectorization" means as used here.  My
understanding of "vector" is as a synonym for "line", which would imply
loop unrolling of some form or another, and that increases code size.  As
I explained in the grandparent, I believe size-increasing optimizations
are counterproductive on modern processors, because a cache miss costs
far more than the few cycles such optimizations save.

I am however aware that vectorization has a somewhat different meaning in
programming terms than the above, but am not sufficiently educated on the
topic to make an informed choice, so I've simply left gcc to go with its
default choice given my overall stated intention of -Os.

If you can explain the concept well enough that I'd feel comfortable
departing from the default, I'd be very grateful! =8^)  By "comfortable" I
mean able to explain why I chose it, and either why it won't interfere
with my overall strategy as outlined in the grandparent or why it's worth
it even if it does.

BTW, I'm also looking for a good reference on LDFLAGS.  I'm using one ATM
(LDFLAGS="-Wl,-z,now", which I can naturally explain if asked but will
skip for the moment).  I've seen mention of a couple of others that look
interesting, but haven't come across anything detailed enough on them to
justify further divergence from the default at this time.  man gcc just
doesn't do it in this case. =8^(

-- 
Duncan - List replies preferred.   No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master."  Richard Stallman
