On Sunday, 22 March 2015 at 03:43:33 UTC, Walter Bright wrote:
On 3/21/2015 2:08 PM, "Ola Fosheim Grøstad" <[email protected]> wrote:
On Saturday, 21 March 2015 at 19:35:02 UTC, Walter Bright wrote:
I know I shouldn't, but I'll bite. Show me the "low level C code" that
effectively uses SIMD vector registers.
You are right, you should not bite. C code is superfluous; this is a
general issue with efficient parallel computations. You want to avoid
dependencies within a single register.
E.g. take a recurrence relation and make an efficient SIMD
implementation for it. You might need to expand the terms so you have
N independent formulas. If it uses floating point you will have to be
careful about drift between the N formulas that are computed in
parallel.
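
To make the recurrence point concrete, here is a minimal sketch in C. It assumes the simple linear recurrence x[i] = a*x[i-1] + b and a lane count of 4; both choices, and the function names, are illustrative assumptions and not anything from the thread. Substituting the recurrence into itself three times gives x[i] = a^4*x[i-4] + b*(1 + a + a^2 + a^3), so each of the 4 lanes depends only on its own earlier value.

```c
#include <assert.h>
#include <stddef.h>

/* Scalar reference: x[i] = a * x[i-1] + b, with x[0] supplied by the caller.
   Every step needs the previous result, so this is one serial chain. */
void recurrence_scalar(double *x, size_t n, double a, double b)
{
    assert(n == 0 || x != NULL);
    for (size_t i = 1; i < n; ++i)
        x[i] = a * x[i - 1] + b;
}

/* Expanded form with 4 independent chains (the lane count is an assumption):
   x[i] = a^4 * x[i-4] + b * (1 + a + a^2 + a^3).
   Lane k reads only x[i+k-4], never a neighbouring lane. */
void recurrence_expanded4(double *x, size_t n, double a, double b)
{
    assert(n == 0 || x != NULL);

    /* Seed x[1..3] with the serial recurrence so all 4 lanes have a start. */
    size_t seed = n < 4 ? n : 4;
    for (size_t i = 1; i < seed; ++i)
        x[i] = a * x[i - 1] + b;

    const double a2 = a * a;
    const double a4 = a2 * a2;
    const double c4 = b * (1.0 + a + a2 + a2 * a);

    size_t i = 4;
    for (; i + 4 <= n; i += 4) {
        /* Fixed trip count, no dependency between the 4 iterations. */
        for (size_t k = 0; k < 4; ++k)
            x[i + k] = a4 * x[i + k - 4] + c4;
    }

    /* Leftover elements fall back to the serial form. */
    for (; i < n; ++i)
        x[i] = a * x[i - 1] + b;
}
```

The two functions are algebraically equivalent but round differently, because a4 and c4 are precomputed once and then reused. Comparing their outputs element by element is one way to see the drift mentioned above; whether a compiler actually maps the inner loop onto vector registers depends on the compiler and its flags.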
I.e. there isn't low level C code that effectively uses SIMD
vector registers. You have to use the auto-vectorizer, which
tries to reconstruct high level operations out of C low level
code, then recompile.
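
As a rough illustration of the kind of loop an auto-vectorizer has to recognize, here is a minimal sketch; the function name and the choice of operation are assumptions for the example. The source spells out one element at a time, and it is up to the compiler to work out that the iterations are independent and can be grouped into vector operations.

```c
#include <assert.h>
#include <stddef.h>

/* Element-wise y[i] = a * x[i] + y[i]. No iteration reads a result of
   another, and `restrict` promises that x and y do not overlap. */
void saxpy(size_t n, float a, const float *restrict x, float *restrict y)
{
    assert(n == 0 || (x != NULL && y != NULL));
    for (size_t i = 0; i < n; ++i)
        y[i] = a * x[i] + y[i];
}
```

Whether this ends up in SIMD registers is not visible in the source; it depends on the compiler, the optimization level, and the target.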
At least it still compiles into efficient code for the PDP-11?