On Sun, 08 Jan 2012 18:56:04 +0100, Peter Alexander wrote:
On 8/01/12 5:02 PM, Martin Nowak wrote:
simdop will need more overloads, e.g. some
instructions need immediate bytes.
z = simdop(SHUFPS, x, y, 0);
How about this:
__v128 simdop(T...)(SIMD op, T args);
It doesn't make a lot of sense for these to return a value, e.g.
__v128 a, b;
a = simdop(movhlps, b); // ???
movhlps moves the top 64 bits of b into the bottom 64 bits of a. That can't
be done as an expression like this.
Would make more sense to just write the instructions like they appear in
asm:
simdop(movhlps, a, b);
simdop(addps, a, b);
etc.
Yeah, I also thought of this. Returning a copy by default would
require eliminating those copies again.
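
To make the movhlps point concrete, here is a rough sketch in plain D that only models the semantics, with no real SIMD involved. V128, movhlps and movhlpsValue are hypothetical names made up for this illustration; they are not the proposed __v128 type or simdop intrinsic. It shows that the destination is also an input, and that a value-returning form has to copy it first.

```d
// Hypothetical stand-in for __v128: four packed floats.
struct V128
{
    float[4] lanes;
}

// In-place form, mirroring the two-operand instruction:
// the top two lanes of src overwrite the bottom two lanes of dst,
// and the top two lanes of dst are left untouched.
void movhlps(ref V128 dst, V128 src)
{
    dst.lanes[0] = src.lanes[2];
    dst.lanes[1] = src.lanes[3];
}

// Value-returning form: dst still has to be passed in, and it is taken
// by value, so the result is a copy that would have to be optimised away.
V128 movhlpsValue(V128 dst, V128 src)
{
    movhlps(dst, src); // modifies the local copy only
    return dst;
}

unittest
{
    auto a = V128([1.0f, 2.0f, 3.0f, 4.0f]);
    auto b = V128([5.0f, 6.0f, 7.0f, 8.0f]);

    // The value form leaves a alone and returns the merged vector.
    auto c = movhlpsValue(a, b);
    assert(a.lanes == [1.0f, 2.0f, 3.0f, 4.0f]);
    assert(c.lanes == [7.0f, 8.0f, 3.0f, 4.0f]);

    // The in-place form updates a directly.
    movhlps(a, b);
    assert(a.lanes == [7.0f, 8.0f, 3.0f, 4.0f]);
}
```

Compiling with something like dmd -unittest -main runs the checks. Note that a one-argument form such as simdop(movhlps, b) has nowhere to get the top two lanes of the result from, which is the problem described above.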
The difference between this and inline asm would be:
1. Registers are automatically allocated.
See asm pseudo-registers.
2. Loads/stores are inserted when we spill to stack.
There are sequencing points before and after asm blocks.
3. Instructions can be scheduled and optimised by the compiler.
Optimization can be done at the IR level.
Scheduling is done after all code is emitted.