"intel-intrinsics" is a DUB package for people interested in x86 performance who want to write neither assembly nor LDC-specific snippets... and still get the fastest possible code.

Available through DUB: http://code.dlang.org/packages/intel-intrinsics


*** Features of v1.1.0:

- All intrinsics in this list: https://software.intel.com/sites/landingpage/IntrinsicsGuide/#techs=MMX,SSE,SSE2 . Use existing Intel documentation and syntax.

- Write the same code for both DMD and LDC, across the last 6 versions of each. (Note that debug performance might suffer a lot when inlining is not enabled.)

- Use operators on SIMD vectors as if core.simd were implemented on DMD 32-bit

- Introduces int2 and float2 because short SIMD vectors are useful

- About 6000 LOC (for now! more to come)

- Bonus: approximated pow/exp/log. Perform 4 approximated pow at once.
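To give an idea of what this looks like in practice, here is a minimal sketch that adds two float slices 4 lanes at a time. It assumes "intel-intrinsics" is declared as a DUB dependency and that the SSE intrinsics live in a module named inteli.xmmintrin; addSlices is a hypothetical helper name, not part of the package. The intrinsic names and semantics follow the Intel guide linked above.

```d
import inteli.xmmintrin; // assumed module name for the SSE intrinsics

// Hypothetical helper: dst[i] = a[i] + b[i], processed 4 floats at a time.
void addSlices(const(float)[] a, const(float)[] b, float[] dst)
{
    assert(a.length == b.length && a.length == dst.length);
    size_t i = 0;
    for (; i + 4 <= a.length; i += 4)
    {
        __m128 va = _mm_loadu_ps(a.ptr + i); // unaligned load of 4 floats
        __m128 vb = _mm_loadu_ps(b.ptr + i);
        // Per the feature list, `va + vb` should be usable here as well.
        _mm_storeu_ps(dst.ptr + i, _mm_add_ps(va, vb));
    }
    for (; i < a.length; ++i) // scalar tail for the leftover elements
        dst[i] = a[i] + b[i];
}

void main()
{
    float[6] a = [1, 2, 3, 4, 5, 6];
    float[6] b = [10, 20, 30, 40, 50, 60];
    float[6] expected = [11, 22, 33, 44, 55, 66];
    float[6] r;
    addSlices(a[], b[], r[]);
    assert(r == expected);
}
```

The same source is meant to build with both DMD and LDC; only the generated code differs.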


<future>
The long-term goal for this library is to be _only about semantics_, and not particularly codegen(!). This is because LLVM IR is portable, so forcing a particular instruction is undoing this portability work. **This can seem odd** for an "intrinsics" library, but this way the exact codegen options can be chosen by the library user, and most intrinsics can, in theory, gracefully degrade to portable IR.

In the future, "magic" LLVM intrinsics will only be used when built for x86, but I think all of it can become portable and not x86-specific. Besides, there is a trend in LLVM to remove magic intrinsics once they are doable with IR only.
</future>


tl;dr you can use "intel-intrinsics" today, and get quite-optimal code with LDC, without duplication. You may come across early bugs too.
http://code.dlang.org/packages/intel-intrinsics

(note: it's important to bench against vanilla D code or array ops too; in some cases the vanilla code wins)
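A possible baseline for such a comparison, as a rough sketch using only Phobos: it times a plain loop against a D array op with std.datetime.stopwatch. The names N, addLoop and addArrayOp are made up for illustration; an intrinsics-based version would be added as a third function passed to benchmark.

```d
import std.datetime.stopwatch : benchmark;
import std.stdio : writeln;

enum N = 4096;          // illustrative buffer size
float[N] a, b, dst;     // module-level buffers shared by both variants

// Vanilla D: plain element-wise loop.
void addLoop()
{
    foreach (i; 0 .. N)
        dst[i] = a[i] + b[i];
}

// Vanilla D: built-in array operation.
void addArrayOp()
{
    dst[] = a[] + b[];
}

void main()
{
    a[] = 1.0f;
    b[] = 2.0f;
    // Run each function 10_000 times and report the total duration of each.
    auto times = benchmark!(addLoop, addArrayOp)(10_000);
    writeln("plain loop: ", times[0]);
    writeln("array op:   ", times[1]);
}
```

Timings only mean something in an optimized build with inlining enabled, and they will vary between DMD and LDC.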
