dsimcha wrote:
== Quote from Don's article
Jeremie Pelletier wrote:
While writing SSE assembly by hand in D is fun and works well, I'm wondering
if the compiler has intrinsics for its instruction set, much like xmmintrin.h
in C.
The reason is that the compiler can usually reorder the intrinsics to optimize
performance.
I could always use C code to implement my SSE routines but then I'd lose the
ability to inline them in D.
I know this is an old post, but since it wasn't answered...
Make sure you know what the SSE intrinsics actually *do* in VC++/Intel!
I've read many complaints about how poorly they perform on all compilers
-- the penalty for allowing them to be reordered is that extra
instructions are often added, which means that straightforward C code is
sometimes faster!
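For anyone unfamiliar with the xmmintrin.h style being discussed, here is a minimal C sketch (C rather than D, since that is where these intrinsics live) of the same element-wise add written twice: once as a straightforward loop and once with SSE intrinsics. The names add_plain and add_sse are made up for illustration, and the __SSE__ guard assumes a GCC- or Clang-style compiler; anything else falls through to the scalar loop. It only shows the two styles side by side and makes no claim about which is faster on a given compiler.

```c
#include <stddef.h>
#if defined(__SSE__)            /* assumption: GCC/Clang-style feature macro */
#include <xmmintrin.h>          /* __m128, _mm_loadu_ps, _mm_add_ps, _mm_storeu_ps */
#endif

/* Straightforward C: the compiler is free to vectorise this itself. */
void add_plain(float *dst, const float *a, const float *b, size_t n)
{
    for (size_t i = 0; i < n; ++i)
        dst[i] = a[i] + b[i];
}

/* Hand-written intrinsics: four floats per iteration, then a scalar tail. */
void add_sse(float *dst, const float *a, const float *b, size_t n)
{
    size_t i = 0;
#if defined(__SSE__)
    for (; i + 4 <= n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);            /* unaligned load of 4 floats */
        __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(dst + i, _mm_add_ps(va, vb)); /* 4 additions at once */
    }
#endif
    for (; i < n; ++i)          /* leftover elements (all of them without SSE) */
        dst[i] = a[i] + b[i];
}
```

Both functions should produce identical results for any n; comparing the generated assembly for the two is the easiest way to see what the compiler adds around the intrinsics.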
In this regard, I'm personally excited about array operations. I think
the need for SSE intrinsics and vectorisation is a result of abstraction
inversion: the instruction set is higher-level than the "high level
language"! Array operations allow D to catch up with asm again. When
array operations get implemented properly, it'll be interesting to see
how much need for SSE intrinsics remains.
What's wrong with the current implementation of array ops (other than a few
miscellaneous bugs that have already been filed)? I thought they already use
SSE if available.
(1) They don't take advantage of fixed-length arrays. In particular,
operations on float[4] should be a single SSE instruction (no function
call, no loop, nothing). This will make a huge difference to game and
graphics programmers, I believe.
(2) The operations don't block on cache size.
(3) DMD doesn't allow you to generate code assuming a minimum set of CPU
capabilities. (In fact, when generating inline asm, the CPU type is
8086! This is in Bugzilla.) This limits the possible use of (1).
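As a rough illustration of one reading of point (2), here is a C sketch of a two-pass operation, dst = (a + b) * c, done block by block so that the intermediate sums are still in cache when the second pass reads them. The function name add_mul_blocked and the block size are assumptions for the example, not anything taken from the D runtime; a real implementation would pick the block size from the actual cache size.

```c
#include <stddef.h>

/* Assumed block size: 4096 floats = 16 KiB, meant to fit a typical L1 data cache. */
enum { BLOCK = 4096 };

/* Computes dst[i] = (a[i] + b[i]) * c[i] in two passes per block. */
void add_mul_blocked(float *dst, const float *a, const float *b,
                     const float *c, size_t n)
{
    for (size_t start = 0; start < n; start += BLOCK) {
        /* The last block may be shorter than BLOCK. */
        size_t end = (n - start < BLOCK) ? n : start + BLOCK;

        for (size_t i = start; i < end; ++i)   /* pass 1: sums into dst */
            dst[i] = a[i] + b[i];
        for (size_t i = start; i < end; ++i)   /* pass 2: dst block is still cached */
            dst[i] *= c[i];
    }
}
```

Without the outer loop, each pass would stream the whole array through the cache, and for large n the second pass would have to fetch dst from main memory again.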
It's issue (1) which is the killer.
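To make point (1) concrete, here is a C sketch of the kind of code one would hope to get for a four-float operand: one load per operand, one add instruction, one store, with no loop and no function call once inlined. The float4 type and add4 function are hypothetical names for the example, and the __SSE__ guard again assumes a GCC- or Clang-style compiler, with a plain loop as the fallback.

```c
#if defined(__SSE__)            /* assumption: GCC/Clang-style feature macro */
#include <xmmintrin.h>
#endif

/* Hypothetical stand-in for D's float[4]. */
typedef struct { float v[4]; } float4;

/* Adds two four-float values; with SSE the arithmetic is a single _mm_add_ps. */
static inline float4 add4(float4 a, float4 b)
{
    float4 r;
#if defined(__SSE__)
    _mm_storeu_ps(r.v, _mm_add_ps(_mm_loadu_ps(a.v), _mm_loadu_ps(b.v)));
#else
    for (int i = 0; i < 4; ++i)  /* scalar fallback when SSE is unavailable */
        r.v[i] = a.v[i] + b.v[i];
#endif
    return r;
}
```

Because the length is known at compile time, there is no tail handling and no length check, which is exactly what a general array-op routine working on slices cannot assume.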