On Fri, 15 Sep 2006 16:47:14 +0000 (UTC), you wrote:

> I am however aware that vectorization has a somewhat different meaning in
> programming terms than the above, but am not sufficiently educated on the
> topic to make an informed choice, so I've simply left gcc to go with its
> default choice given my overall stated intention of -Os.
Older super-computers, especially those designed or inspired by Seymour Cray, included "vector registers". These were multiple registers (typically a small power of 2 -- say 32, 64, or 128) that could be manipulated as a unit. This was an earlier form of SIMD. By issuing a single instruction -- such as a vector load or vector add -- you repeated the same operation on a sequence of operands. The crucial difference was that these vector operations had some start-up overhead and then ran autonomously, delivering one result every clock tick for the length of the vector register.

While some compilers added proprietary language extensions to support vector values as actual data types, most numeric code was written in scalar form. To make such super-computers useful it was crucial that compilers be able to recognize when a scalar loop could be implemented using the machine's vector facilities. Fundamentally this came down to figuring out when successive loop iterations were independent and hence could execute in parallel. Since the compiler was attempting to re-express scalar loops as loops using vector primitives, the optimization became known as "vectorization".

In essence, a multi-media SIMD mechanism is very similar. A 64-bit register containing four 16-bit operands is essentially a length-4 "vector register". Finding opportunities to use such SIMD instructions in scalar code requires exactly the same forms of analysis and optimization.

/john

--
[email protected] mailing list
