Re: SIMD support...

Peter Alexander Sat, 07 Jan 2012 18:20:21 -0800

On 8/01/12 1:32 AM, Manu wrote:

On 8 January 2012 02:54, Peter Alexander wrote:


    I agree with Manu that we should just have a single type like __m128
    in MSVC. The other types and their conversions should be solvable in
    a library with something like strong typedefs.


Walter put in a reasonable effort to sway me to his side of the fence
last night. I'm still not entirely sold that implementation inside the
language is necessary to achieve these details, but I don't have enough
background info to argue, and I'm not the one that has to maintain the
code :)

Here are some points we discussed... how do we do these (efficiently) in
a library?

Just to be clear, it was only the types and conversions that I thought
would be suitable for a library. Operations, along with their
optimisations, are best left to the compiler.
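To make the "strong typedef" idea concrete, here is a minimal sketch of how typed views could be layered over one untyped 128-bit type in a library. Everything here (V128, Typed, fromLanes, reinterpret, Float4, Int4) is a hypothetical name invented for illustration, not an existing or proposed API, and the byte array only stands in for a real register type.

```d
// Hypothetical untyped 128-bit type, standing in for something like __m128.
align(16) struct V128
{
    ubyte[16] bits;
}

// Strong typedef: one distinct type per lane layout, all sharing V128 storage.
struct Typed(T, size_t N) if (T.sizeof * N == 16)
{
    V128 raw;

    // Build a value from individual lanes (illustration only).
    static Typed fromLanes(T[N] lanes)
    {
        Typed r;
        (cast(T*) r.raw.bits.ptr)[0 .. N] = lanes[];
        return r;
    }

    // Explicit reinterpretation of the same 16 bytes as another layout.
    Typed!(U, M) reinterpret(U, size_t M)() const
    {
        Typed!(U, M) r;
        r.raw = raw;
        return r;
    }

    // Read one lane.
    T opIndex(size_t i) const
    {
        assert(i < N);
        return (cast(const(T)*) raw.bits.ptr)[i];
    }
}

alias Float4 = Typed!(float, 4);
alias Int4   = Typed!(int, 4);

unittest
{
    auto f = Float4.fromLanes([1.0f, 2.0f, 3.0f, 4.0f]);
    // Int4 i = f;                    // would not compile: distinct types
    Int4 i = f.reinterpret!(int, 4)();
    assert(i[0] == 0x3F80_0000);      // IEEE 754 bit pattern of 1.0f
}
```

The point of the sketch is only that conversions become explicit and type-checked without the compiler knowing about more than one vector type; arithmetic on the lanes is deliberately left out.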

** Literal syntax.. and constant folding:

Constants and literals also need to be aligned. If we use array syntax
to express literals, this will be a problem.

  int4 v = [ 1,2,3,4 ] + [ 5,6,7,8 ];

Any constant expressions need to be simplified at compile time: int4 vec
= [ 6,8,10,12 ];
Perhaps this is possible with CTFE? Or will it be automatic if you
express literals as if they were arrays?

You could use array syntax for vector literals, as long as they are
stored directly into vector variables, e.g.


immutable int4 a = [1, 2, 3, 4];
immutable int4 b = [5, 6, 7, 8];
int4 v = a + b;

Constant folding can be done by the compiler, although I don't think
this is a priority.
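On the CTFE question: for plain arrays, at least, folding at compile time already works through an ordinary function. The sketch below uses int[4] as a stand-in for a vector type and a hypothetical helper named addLanes; whether the same would carry over to a real int4 depends on how the type ends up being implemented.

```d
// Hypothetical lane-wise add on plain static arrays (stand-in for int4).
int[4] addLanes(int[4] a, int[4] b)
{
    int[4] r;
    foreach (i; 0 .. 4)
        r[i] = a[i] + b[i];
    return r;
}

// An enum initialiser forces evaluation at compile time (CTFE).
enum int[4] folded = addLanes([1, 2, 3, 4], [5, 6, 7, 8]);

// Checked by the compiler, so no addition is left for run time.
static assert(folded == [6, 8, 10, 12]);
```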

** Expression interpretation/simplification:

  float4 v = -b + a;

Obviously, this should be simplified to 'a - b'.

  float4 v = a*b + c;

This should use a multiply-accumulate opcode on most architectures:
FMADDPS v, a, b, c

The compiler should make these decisions, just like it does with
int/float etc. In some cases these kinds of simplifications can affect
the result due to numeric issues.
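As one illustration of the numeric issue (using reassociation of scalar floats rather than either of the vector expressions above), regrouping an addition is enough to change the answer. The function names sumLeft and sumRight are made up for this sketch.

```d
// Same three operands, two groupings.
float sumLeft(float a, float b, float c)  { return (a + b) + c; }
float sumRight(float a, float b, float c) { return a + (b + c); }

unittest
{
    // 1e20f and -1e20f cancel exactly, so the 1.0f survives.
    assert(sumLeft(1e20f, -1e20f, 1.0f) == 1.0f);

    // -1e20f + 1.0f rounds back to -1e20f, so the 1.0f is lost.
    assert(sumRight(1e20f, -1e20f, 1.0f) == 0.0f);
}
```

So a compiler that rewrites floating-point expressions freely has to either prove the rewrite is exact or be told that the programmer doesn't mind.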

You can use expression templates for this sort of thing as well, but
they are a horrible mess, so I don't think I'd like to see them.

** Typed debug info

In a debugger it's nice to inspect variables in their supposed type.
Can probably use unions to do this... probably wouldn't be as nice though.

Good point. I'm not an expert on this, but I suspect that a union would
be good enough?
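Roughly what I have in mind with the union, as a sketch only: Vec128 and its member names are hypothetical, and how nicely a given debugger displays the members is something I haven't checked.

```d
// Hypothetical 128-bit value with several typed views over the same bytes.
align(16) union Vec128
{
    float[4]  f32;
    int[4]    i32;
    short[8]  i16;
    ubyte[16] u8;
}

unittest
{
    Vec128 v;
    v.f32 = [1.0f, 2.0f, 3.0f, 4.0f];

    // The integer view shows the raw IEEE 754 bit patterns of the floats.
    assert(v.i32[0] == 0x3F80_0000);   // 1.0f
    assert(v.i32[1] == 0x4000_0000);   // 2.0f
}
```

A debugger would then list every view at once, which is noisier than a single properly typed variable but does expose the lanes.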

** God knows what other optimisations

float4 v = [ 0,0,0,0 ]; // XOR v
etc...

Again, I think you could use expression templates for this, but it's so
much simpler to leave this optimisation to the compiler.

Even if the compiler doesn't do it, it's not difficult to do it manually
when you really need it:


float4 v = void;
asm { pxor v, v; }

Honestly, I'm not too bothered with these types of optimisations. As
long as the compiler does the register allocation and instruction
scheduling for me, I would be 99% happy because those things are the
most tedious when trying to write structured code. I can easily enough
change (-b + a) to (a - b) if that's faster, or insert specific
instructions for generating vector constants, or do constant folding
manually.

Of course, it would be nice if the compiler did them, but that's just
icing on the cake. The meat of the problem is register allocation.

I don't know what amount of this is achievable with libraries, but
Walter seems to think this will all work much better in the language...
I'm inclined to trust his judgement.


I agree.
