On 15.01.2012 at 11:45, Manu <[email protected]> wrote:

On 15 January 2012 08:16, Sean Cavanaugh <[email protected]> wrote:

On 1/15/2012 12:09 AM, Walter Bright wrote:

On 1/14/2012 9:58 PM, Sean Cavanaugh wrote:

MS has three types, __m128, __m128i and __m128d (float, int, double)

Six if you count AVX's 256 forms.

On 1/7/2012 6:54 PM, Peter Alexander wrote:

On 7/01/12 9:28 PM, Andrei Alexandrescu wrote:
I agree with Manu that we should just have a single type like __m128 in
MSVC. The other types and their conversions should be solvable in a
library with something like strong typedefs.


The trouble with MS's scheme is that, given the following:

__m128i v;
v += 2;

Can't tell what to do. With D,

int4 v;
v += 2;

it's clear (add 2 to each of the 4 ints).
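To make the point concrete in C++ terms: a raw __m128i carries no lane width, so a library wrapper has to pick one before `+=` can mean anything. The following is only a sketch, assuming x86 with SSE2; the `int4` struct and its `lane` accessor are hypothetical names for illustration, not part of MSVC or of D's vector types.

```cpp
#include <emmintrin.h>  // SSE2: __m128i and the _mm_*_epi32 intrinsics

// Hypothetical wrapper (an assumption for illustration): it pins the lane
// layout of a raw __m128i to four 32-bit signed integers.
struct int4 {
    __m128i v;

    int4(int a, int b, int c, int d) : v(_mm_setr_epi32(a, b, c, d)) {}

    // The scalar is broadcast to all four lanes, then added lane by lane.
    int4& operator+=(int s) {
        v = _mm_add_epi32(v, _mm_set1_epi32(s));
        return *this;
    }

    // Read one lane back out; i must be in the range 0..3.
    int lane(int i) const {
        int out[4];
        _mm_storeu_si128(reinterpret_cast<__m128i*>(out), v);
        return out[i];
    }
};
```

With this in place, `int4 v(1, 2, 3, 4); v += 2;` adds 2 to each of the four ints, because the type itself says which add instruction applies.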


Working with their intrinsics in their raw form for real code is pure
insanity :) You need to wrap it all with a good math library (even if
90% of the library is the intrinsics wrapped into __forceinlined
functions), so you can start having sensible operator overloads, and so
you can write code that is readable.


if (any4(a > b))
{
 // do stuff
}


is way way way better than (pseudocode)

if (_mm_movemask_ps(_mm_cmpgt_ps(a, b)) != 0)
{
}
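A minimal sketch of how such helpers might look, assuming x86 with SSE. The names `any4` and `all4` are hypothetical (only `any4` appears in the post); the intrinsics `_mm_cmpgt_ps` and `_mm_movemask_ps` are the real SSE ones. Note that a movemask result of 0x0F means all four lanes passed the comparison, whereas "any" only needs a nonzero result.

```cpp
#include <xmmintrin.h>  // SSE: __m128, _mm_cmpgt_ps, _mm_movemask_ps

// Hypothetical helpers. _mm_movemask_ps packs the sign bit of each of the
// four float lanes into bits 0..3 of an int; a comparison result has all
// bits set in every lane where the comparison was true.
inline bool any4(__m128 mask) { return _mm_movemask_ps(mask) != 0; }
inline bool all4(__m128 mask) { return _mm_movemask_ps(mask) == 0x0F; }
```

Without operator overloads on the vector type, the call site would read `if (any4(_mm_cmpgt_ps(a, b)))`; a wrapper type with an overloaded `operator>` is what gets it down to `any4(a > b)`.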



and (if the ternary operator were overloadable in C++)

float4 foo = (a > b) ? c : d;

would be better than

float4 mask = _mm_cmpgt_ps(a, b);
float4 foo = _mm_or_ps(_mm_and_ps(mask, c), _mm_andnot_ps(mask, d));
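Since `?:` cannot be overloaded, the usual workaround is a named select function. This is a sketch only, assuming x86 with SSE; `select4` is a made-up name, while `_mm_and_ps`, `_mm_andnot_ps` and `_mm_or_ps` are the actual intrinsics.

```cpp
#include <xmmintrin.h>  // SSE: __m128 and the bitwise _ps intrinsics

// Hypothetical helper: per-lane equivalent of (mask ? c : d).
// mask is expected to be a comparison result, i.e. each lane is either
// all ones or all zeros.
inline __m128 select4(__m128 mask, __m128 c, __m128 d) {
    // _mm_andnot_ps(mask, d) computes (~mask) & d, so lanes where the
    // mask is set take their value from c and the remaining lanes from d.
    return _mm_or_ps(_mm_and_ps(mask, c), _mm_andnot_ps(mask, d));
}
```

A call then reads `select4(_mm_cmpgt_ps(a, b), c, d)`, which keeps the bit manipulation in one place.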


Yep, it's coming... baby steps :)

Walter: I told you games devs would be all over this! :P

And even compression algorithms. I found one written in C that uses external .asm files, which are assembled into object files with NASM and passed on the linker command line. They contain MMX or SSE code, depending on the processor you plan to target. The author claims that the MMX version of the 'outsourced' routines runs 8x faster. I didn't verify this, but the idea that these instructions become part of the language and easy to use for regular programmers like me (and not just console game developers) is exciting. I bet more programs could benefit from SSE than is obvious, and that there is code that could be rewritten so that multiple data sets are processed simultaneously.
