I can see your reasoning, but I think that should be in core.sse, or
core.simd.sse personally. Or you'll end up with VMX, NEON, etc. all
blobbed in one huge intrinsic wrapper file.
I would be okay with core.simd.sse or core.sse.
That said, almost all simd opcodes are directly accessible in
std.simd. There are relatively few obscure operations that don't have
a corresponding function. The unpck/shuf example above, for instance:
they both effectively perform a sort of swizzle, and both are
accessible through swizzle!().
They aren't. Swizzle only takes one argument, so you can't use it
to select elements from two vectors. Both unpcklps and shufps
take two arguments. Writing a swizzle with two arguments would be
much harder.
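
To make the difference concrete, here is a minimal scalar sketch in C++ (not std.simd code, and not real intrinsics). The names swizzle1, shufps_model and unpcklps_model are hypothetical and exist only for illustration; the lane selection follows the documented behaviour of SHUFPS and UNPCKLPS.

```cpp
#include <array>
#include <cstdint>

using Vec4 = std::array<float, 4>;

// One-argument swizzle: every result lane is picked from the same vector.
Vec4 swizzle1(const Vec4& v, const std::array<int, 4>& idx) {
    return {v[idx[0] & 3], v[idx[1] & 3], v[idx[2] & 3], v[idx[3] & 3]};
}

// Scalar model of SHUFPS: the two low lanes come from a, the two high
// lanes come from b, each chosen by a 2-bit field of the immediate.
Vec4 shufps_model(const Vec4& a, const Vec4& b, std::uint8_t imm) {
    return {a[imm & 3], a[(imm >> 2) & 3],
            b[(imm >> 4) & 3], b[(imm >> 6) & 3]};
}

// Scalar model of UNPCKLPS: interleave the low halves of a and b.
Vec4 unpcklps_model(const Vec4& a, const Vec4& b) {
    return {a[0], b[0], a[1], b[1]};
}
```

With a = {a0, a1, a2, a3} and b = {b0, b1, b2, b3}, shufps_model(a, b, 0x44) gives {a0, a1, b0, b1} and unpcklps_model(a, b) gives {a0, b0, a1, b1}. Neither result can be produced by swizzle1 on a single input, because both mix lanes from two different vectors.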
The swizzle mask is analysed by the template, and it produces the best
opcode to match the pattern. Take a look at swizzle, it's bloody
complicated to do that the most efficient way on x86.
Now imagine how complicated it would be to write a swizzle with
two vector arguments.
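
As a rough illustration of what "analysing the mask" means, here is a small C++ sketch of a compile-time classifier for a four-lane, one-argument mask. It is an assumption-laden toy, not the logic std.simd actually uses: the Pattern enum, the classify function and the handful of patterns it recognises are all made up for this example.

```cpp
// Hypothetical pattern categories a swizzle template might special-case.
enum class Pattern { Identity, Broadcast, UnpackLowSelf, General };

// Classify a 4-lane mask at compile time (single return keeps it
// valid as a C++11 constexpr function).
constexpr Pattern classify(int i0, int i1, int i2, int i3) {
    return (i0 == 0 && i1 == 1 && i2 == 2 && i3 == 3) ? Pattern::Identity
         : (i0 == i1 && i1 == i2 && i2 == i3)         ? Pattern::Broadcast
         // {v0, v0, v1, v1} is what UNPCKLPS v, v produces.
         : (i0 == 0 && i1 == 0 && i2 == 1 && i3 == 1) ? Pattern::UnpackLowSelf
         : Pattern::General;
}
```

With one source vector each lane index has 4 possible values, so there are 4^4 = 256 masks to map onto opcodes. With two source vectors each lane can pick from 8 elements, which gives 8^4 = 4096 masks, and the set of instructions that can cover them grows accordingly.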
The reason I didn't write the DMD support yet is because it was
incomplete, and many opcodes weren't yet accessible, like shuf for
instance... and I just wasn't finished. Stopped to wait for DMD to be
feature complete.
I'm not opposed to this idea, although I do have a concern that,
because there's no __forceinline in D (or macros), adding another
layer of abstraction will make maths code REALLY slow in unoptimised
builds.
Can you suggest a method where these would be treated as C macros, and
not produce additional layers of function calls?
Unfortunately I can't, at least not a clean one. Using string
mixins would be one way, but I think no one wants that kind of API
in Druntime or Phobos.
I'm already unhappy that std.simd produces redundant function calls.
<rant> please please please can haz __forceinline! </rant>
I agree that we need that.