Re: SIMD support...

Peter Alexander Sun, 08 Jan 2012 09:50:50 -0800

On 8/01/12 5:02 PM, Martin Nowak wrote:

simdop will need more overloads, e.g. some
instructions need immediate bytes.
z = simdop(SHUFPS, x, y, 0);


How about this:
__v128 simdop(T...)(SIMD op, T args);


These don't make a lot of sense to return as a value, e.g.

__v128 a, b;
a = simdop(movhlps, b); // ???

movhlps moves the top 64 bits of b into the bottom 64 bits of a and leaves the top 64 bits of a untouched, so a is both an input and an output. That can't be done as an expression like this.
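To make the problem concrete, here is a minimal scalar model of what movhlps does, written as plain D over float[4]. The name movhlpsModel is hypothetical and exists only for illustration; it is not an intrinsic and not part of the proposed simdop API. Lane 0 is taken to be the lowest 32 bits.

```d
// Scalar model of MOVHLPS, for illustration only (not a real intrinsic).
// dst is read as well as written: its upper two lanes survive the operation.
void movhlpsModel(ref float[4] dst, const float[4] src)
{
    dst[0] = src[2]; // high 64 bits of src -> low 64 bits of dst
    dst[1] = src[3];
    // dst[2] and dst[3] are deliberately left untouched
}

unittest
{
    float[4] a = [1f, 2f, 3f, 4f];
    float[4] b = [5f, 6f, 7f, 8f];
    movhlpsModel(a, b);
    float[4] expected = [7f, 8f, 3f, 4f];
    assert(a == expected); // the old upper half of a is still there
}
```

Because the result depends on the previous contents of a, a form like a = simdop(movhlps, b) has nowhere to take the upper two lanes from.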

Would make more sense to just write the instructions like they appear in asm:


simdop(movhlps, a, b);
simdop(addps, a, b);
etc.

The difference between this and inline asm would be:

1. Registers are automatically allocated.
2. Loads/stores are inserted when we spill to stack.
3. Instructions can be scheduled and optimised by the compiler.

We could then extend this with user-defined types:

struct float4
{
  union
  {
     __v128 v;
     float[4] for_debugging;
  }

  float4 opBinary(string op:"+")(float4 rhs) @forceinline
  {
    __v128 result = v;
    simdop(addps, result, rhs);
    return float4(result);
  }
}

We'd need a strong guarantee of inlining and removal of redundant loads/stores though for this to work well. We'd also need a guarantee that float4's would get the same treatment as __v128 (as it is the only element).
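As a rough sketch of how the wrapper behaves, the same shape can be modelled in portable D with float[4] standing in for __v128. Everything here is an assumption made for illustration: addpsModel and Float4Model are hypothetical names, and the scalar loop only mimics the destructive two-operand form of addps (dst += src); it says nothing about inlining or code generation.

```d
// Scalar stand-in for ADDPS in destructive two-operand form: dst += src.
// Hypothetical helper that models the proposed simdop(addps, dst, src).
void addpsModel(ref float[4] dst, const float[4] src)
{
    foreach (i; 0 .. 4)
        dst[i] += src[i];
}

struct Float4Model
{
    float[4] v; // stands in for __v128

    Float4Model opBinary(string op : "+")(Float4Model rhs) const
    {
        float[4] result = v;       // copy, so the left operand is not clobbered
        addpsModel(result, rhs.v); // result += rhs, lane by lane
        return Float4Model(result);
    }
}

unittest
{
    auto x = Float4Model([1f, 2f, 3f, 4f]);
    auto y = Float4Model([10f, 20f, 30f, 40f]);
    auto z = x + y;
    float[4] expected = [11f, 22f, 33f, 44f];
    assert(z.v == expected);
}
```

The copy into result is the kind of move that would have to be optimised away for the wrapper to cost nothing over using __v128 directly.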
