On 3/21/2015 2:08 PM, Ola Fosheim Grøstad wrote:
On Saturday, 21 March 2015 at 19:35:02 UTC, Walter Bright wrote:
I know I shouldn't, but I'll bite. Show me the "low level C code" that
effectively uses SIMD vector registers.

You are right, you should not bite. C code is superfluous; this is a general
issue with efficient parallel computations. You want to avoid dependencies
within a single register.

E.g., take a recurrence relation and make an efficient SIMD implementation for
it. You might need to expand the terms so you have N independent formulas. If
it uses floating point, you will have to be careful about drift between the N
formulas that are computed in parallel.
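
To make the recurrence point concrete, here is a minimal sketch in plain C. It is an illustration only, not code from this thread: the recurrence x[i] = a * x[i-1] + b, the lane count of 4, and the names recurrence_serial and recurrence_lanes are all assumptions chosen for the example. Expanding the recurrence four steps gives x[i] = a^4 * x[i-4] + b * (1 + a + a^2 + a^3), so four lanes can advance without depending on each other.

```c
#include <stddef.h>

#define LANES 4 /* assumed vector width, chosen for illustration */

/* Serial form: x[i] = a * x[i-1] + b. Every step waits on the previous one. */
void recurrence_serial(float a, float b, float x0, float *out, size_t n)
{
    float x = x0;
    for (size_t i = 0; i < n; i++) {
        x = a * x + b;
        out[i] = x;
    }
}

/* Expanded form: x[i] = a^4 * x[i-4] + b * (1 + a + a^2 + a^3).
 * Each lane depends only on its own previous value, so the inner loop
 * has no cross-lane dependency and maps onto one SIMD register. */
void recurrence_lanes(float a, float b, float x0, float *out, size_t n)
{
    float a2 = a * a;
    float a4 = a2 * a2;
    float b4 = b * (1.0f + a + a2 + a2 * a);
    float lane[LANES];

    /* Seed the lanes with the first LANES terms, computed serially. */
    float x = x0;
    for (size_t k = 0; k < LANES; k++) {
        x = a * x + b;
        lane[k] = x;
    }

    /* Full blocks: store the current terms, then step every lane by LANES. */
    size_t i = 0;
    for (; i + LANES <= n; i += LANES) {
        for (size_t k = 0; k < LANES; k++) {
            out[i + k] = lane[k];
            lane[k] = a4 * lane[k] + b4;
        }
    }

    /* Tail: fewer than LANES terms remain. */
    for (size_t k = 0; i + k < n; k++) {
        out[i + k] = lane[k];
    }
}
```

Both functions compute the same sequence mathematically, but the rounding differs: the expanded form uses the precomputed constants a4 and b4, so for general floating-point inputs the lanes can drift away from the serial result and from each other as the sequence gets longer.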


I.e. there isn't low level C code that effectively uses SIMD vector registers. You have to use the auto-vectorizer, which tries to reconstruct high level operations out of low level C code, then recompile.
