David Nadlinger Wrote:

> On 12/29/11 2:13 PM, a wrote:
> > void test(ref V a, ref V b)
> > {
> >      asm
> >      {
> >          movaps XMM0, a;
> >          addps  XMM0, b;
> >          movaps a, XMM0;
> >      }
> >      asm
> >      {
> >          movaps XMM0, a;
> >          addps  XMM0, b;
> >          movaps a, XMM0;
> >      }
> > }
> >
> > […]
> >
> > The needless loads and stores would make it impossible to write an efficient 
> > simd add function even if the functions containing asm blocks could be 
> > inlined.
> 
> Yes, this is indeed a problem, and as far as I'm aware, usually solved 
> in the gamedev world by using the (SSE) intrinsics your favorite C++ 
> compiler provides, instead of resorting to inline asm.
> 
> David

IIRC Walter doesn't want to add vector intrinsics, so it would be nice if 
functions that perform vector operations could be written efficiently using 
inline assembly. That would also be a more general solution than intrinsics. 
Something like that is possible with GCC extended inline assembly, where the 
operand constraints let the compiler pick the registers. For example, this: 

typedef float v4sf __attribute__((vector_size(16)));

void vadd(v4sf *a, v4sf *b)
{
    asm(
        "addps %1, %0" 
        : "=x" (*a) 
        : "x" (*b), "0" (*a)
        : );
}

void test(float * __restrict__ a, float * __restrict__ b)
{
    v4sf * va = (v4sf*) a;
    v4sf * vb = (v4sf*) b;
    vadd(va,vb);
    vadd(va,vb);
    vadd(va,vb);
    vadd(va,vb);
}

compiles to:

00000000004004c0 <test>:
  4004c0:       0f 28 0e                movaps (%rsi),%xmm1
  4004c3:       0f 28 07                movaps (%rdi),%xmm0
  4004c6:       0f 58 c1                addps  %xmm1,%xmm0
  4004c9:       0f 58 c1                addps  %xmm1,%xmm0
  4004cc:       0f 58 c1                addps  %xmm1,%xmm0
  4004cf:       0f 58 c1                addps  %xmm1,%xmm0
  4004d2:       0f 29 07                movaps %xmm0,(%rdi)
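
As a sketch only, the same idea can be written with a single read-write 
operand: the "+x" constraint replaces the "=x" output plus the matching "0" 
input. The names vadd_rw and add_four_times are made up for illustration, and 
the non-SSE fallback branch is an addition that is not part of the example 
above.

```c
#include <assert.h>

typedef float v4sf __attribute__((vector_size(16)));

/* Sanity check on the vector type used below. */
static_assert(sizeof(v4sf) == 4 * sizeof(float), "v4sf must hold four floats");

/* Hypothetical variant of vadd: "+x" marks *a as an SSE register operand
   that is both read and written by the instruction. */
static inline void vadd_rw(v4sf *a, const v4sf *b)
{
#if defined(__SSE__)
    __asm__("addps %1, %0"   /* AT&T order: %0 = %0 + %1 */
            : "+x" (*a)
            : "x" (*b));
#else
    *a += *b;                /* fallback: GCC vector extension arithmetic */
#endif
}

/* Hypothetical driver: adds *b to *a four times, in place.
   __restrict__ promises the compiler that a and b do not overlap. */
void add_four_times(v4sf *__restrict__ a, const v4sf *__restrict__ b)
{
    for (int i = 0; i < 4; i++)
        vadd_rw(a, b);
}
```

Because the operands are described by constraints instead of fixed register 
names, the compiler is free to keep *a in one register across all four 
additions once vadd_rw is inlined. Whether the generated code matches the 
listing above will depend on the compiler version and optimization flags.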

This should also be possible with GDC, but I couldn't figure out how to get 
something like __restrict__ there. If you want to use vector types and GCC 
extended inline assembly with GDC, see 
http://www.digitalmars.com/d/archives/D/gnu/Support_for_gcc_vector_attributes_SIMD_builtins_3778.html
and https://bitbucket.org/goshawk/gdc/wiki/UserDocumentation.
