On 5/15/2012 9:39 AM, jerro wrote:
Note that core.simd currently only defines SSE intrinsics for
instructions of the form
    INSTRUCTION xmm1, xmm2/m128
which means that instructions such as shufps, which takes an
additional immediate operand, are not supported.
You could take a look at GDC, which provides GCC builtins
through the module gcc.builtins. To find the builtin names, you
can look at GCC's implementation of xmmintrin.h. GDC also
produces faster code than DMD, especially for floating point
code. It does not yet support AVX, though.
If you want to use AVX for operations that don't have an
operator, currently your only choice (AFAIK) is to use LDC
and an ugly workaround that I used at
https://github.com/jerro/pfft. You write your "intrinsics"
in C and use clang to compile them to .bc (or write a .ll
file manually if you know the LLVM assembly language). Then
you compile your D code with LDC using the flags -output-bc
and -singleobj. You merge the resulting .bc file with the
"intrinsics" file using llvm-link, then optimize the result
using opt and convert it to assembly using llc. Here is an
example:
https://github.com/jerro/pfft/blob/master/build-ldc2.sh
I have only tried this on Linux.
You can also use the inline assembler for shufps, and for AVX
instructions as well.
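
For illustration, here is a minimal sketch of the inline assembler
route for shufps. It assumes DMD targeting x86_64; the function name
shufpsSketch and the immediate 0x1B are made up for the example, and
unaligned loads and stores are used so the arrays need no special
alignment.

```d
// Hypothetical helper, not part of any library.
// Computes dst = [a[3], a[2], b[1], b[0]] using shufps.
void shufpsSketch(const(float)* a, const(float)* b, float* dst)
{
    version (D_InlineAsm_X86_64)
    {
        asm
        {
            mov RAX, a;
            movups XMM0, [RAX];      // XMM0 = a[0..4]
            mov RAX, b;
            movups XMM1, [RAX];      // XMM1 = b[0..4]
            // imm8 = 0x1B = 00_01_10_11b: the low two result lanes
            // come from XMM0 (lanes 3, 2), the high two from XMM1
            // (lanes 1, 0).
            shufps XMM0, XMM1, 0x1B;
            mov RAX, dst;
            movups [RAX], XMM0;      // store the shuffled vector
        }
    }
    else
    {
        static assert(0, "this sketch assumes x86_64 inline asm");
    }
}
```

Calling it with a = [1, 2, 3, 4] and b = [5, 6, 7, 8] should leave
[4, 3, 6, 5] in dst. Keep in mind that the compiler generally will not
inline a function containing an asm block, so this fits best around
larger loops rather than single operations.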