I can see your reasoning, but personally I think that should be in core.sse or core.simd.sse. Otherwise you'll end up with VMX, NEON, etc. all blobbed in one huge intrinsic wrapper file.

I would be okay with core.simd.sse or core.sse.

That said, almost all SIMD opcodes are directly accessible in std.simd. There are relatively few obscure operations that don't have a corresponding function. Take the unpck/shuf example above: both effectively perform a sort of swizzle, and both are accessible through swizzle!().

They aren't. Swizzle only takes one argument, so you can't use it to select elements from two vectors. Both unpcklps and shufps take two arguments. Writing a swizzle with two arguments would be much harder.
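
To make the difference concrete, here is a rough scalar sketch of what each operation selects. It uses plain float[4] arrays rather than real vector types, and the names Float4, swizzle1, shufps and unpcklps are made up for illustration; they are not std.simd or core.simd functions. The lane selection follows the documented behaviour of the x86 SHUFPS and UNPCKLPS instructions.

```d
// Illustrative scalar models only. Nothing here is a std.simd or core.simd API.
alias Float4 = float[4];

// One-argument swizzle: every output lane is picked from the same vector.
Float4 swizzle1(int i0, int i1, int i2, int i3)(Float4 v)
{
    static assert(i0 >= 0 && i0 < 4 && i1 >= 0 && i1 < 4 &&
                  i2 >= 0 && i2 < 4 && i3 >= 0 && i3 < 4,
                  "lane indices must be in 0 .. 4");
    Float4 r = [v[i0], v[i1], v[i2], v[i3]];
    return r;
}

// shufps: the two low lanes come from a, the two high lanes come from b.
// Each 2-bit field of imm selects a source lane.
Float4 shufps(ubyte imm)(Float4 a, Float4 b)
{
    Float4 r = [a[imm & 3], a[(imm >> 2) & 3],
                b[(imm >> 4) & 3], b[(imm >> 6) & 3]];
    return r;
}

// unpcklps: interleave the low halves of a and b.
Float4 unpcklps(Float4 a, Float4 b)
{
    Float4 r = [a[0], b[0], a[1], b[1]];
    return r;
}

unittest
{
    Float4 a = [0f, 1, 2, 3];
    Float4 b = [10f, 11, 12, 13];

    assert(swizzle1!(3, 2, 1, 0)(a) == [3f, 2, 1, 0]);
    assert(unpcklps(a, b) == [0f, 10, 1, 11]);
    assert(shufps!0b01_00_11_10(a, b) == [2f, 3, 10, 11]);
}
```

The last two results mix lanes from both a and b, which a swizzle that only sees one vector has no way to express. The unittest block can be run with something like rdmd -unittest -main.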

The swizzle mask is analysed by the template, and it produces the best opcode to match the pattern. Take a look at swizzle; it's bloody complicated to do that in the most efficient way on x86.

Now imagine how complicated it would be to write a swizzle with two vector arguments.

The reason I didn't write the DMD support yet is because it was incomplete, and many opcodes weren't yet accessible, like shuf for instance... and I just wasn't finished. I stopped to wait for DMD to be feature complete. I'm not opposed to this idea, although I do have a concern that, because there's no __forceinline in D (or macros), adding another layer of abstraction will make maths code REALLY slow in unoptimised builds. Can you suggest a method where these would be treated like C macros, and not produce additional layers of function calls?

Unfortunately I can't, at least not a clean one. Using string mixins would be one way but I think no one wants that kind of API in Druntime or Phobos.
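
For reference, this is roughly what I mean by the string mixin route. It is only a sketch, and mulAdd is a hypothetical helper, not anything that exists in Druntime or Phobos. The helper runs at compile time and returns source text, and mixin() pastes that text into the caller as an expression, so no call to the helper remains in the generated code, even in an unoptimised build.

```d
// Hypothetical helper: builds D source text for the expression a * b + c.
// It is evaluated at compile time (CTFE) when used inside mixin().
string mulAdd(string a, string b, string c)
{
    return "((" ~ a ~ ") * (" ~ b ~ ") + (" ~ c ~ "))";
}

unittest
{
    float x = 2, y = 3, z = 4;

    // Expands in place to: ((x) * (y) + (z))
    float r = mixin(mulAdd("x", "y", "z"));
    assert(r == 10);
}
```

The catch is visible at the call site: operands have to be passed as strings and every use has to be wrapped in mixin(), which is why it is not a pleasant API.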

I'm already unhappy that std.simd produces redundant function calls.


<rant> please please please can haz __forceinline! </rant>

I agree that we need that.
