The intention was that std.simd would be a flat C-style API, which would be
the lowest level required for practical and portable use.

Since LDC and GDC implement intrinsics with an API different from the one used in DMD, there are actually two kinds of portability we need to worry about: portability across compilers and portability across architectures. std.simd solves both of those problems, which is great for many use cases (for example, when dealing with geometric vectors), but it doesn't help when you want to use architecture-dependent functionality directly. In that case one would want an interface that is as close to the actual instructions as possible but uniform across compilers. I think we should define such an interface as functions and templates in core.simd, so you would have, for example:

float4 unpcklps(float4, float4);
float4 shufps(int, int, int, int)(float4, float4);

Then each compiler would implement this API in its own way. DMD would use __simd (1), GDC would use GCC builtins, and LDC would use LLVM intrinsics and shufflevector. If we don't include something like this in core.simd, many applications will need to implement their own versions of it. Using it would also reduce the amount of code needed to implement std.simd (currently most of std.simd only supports GDC, and it's already pretty large). What do you think about adding such an API to core.simd?
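To make the intended semantics of the two example functions concrete, here is a minimal scalar sketch that any compiler-specific backend would have to match. It is not real core.simd code: Float4 is a hypothetical stand-in for float4 so the example compiles without SIMD support, the order of the four shufps template arguments is my assumption (the signature alone doesn't fix it), and shufpsImm is a hypothetical helper showing how those indices would pack into the instruction's immediate byte.

```d
// Scalar reference sketch of the proposed API (not the real core.simd).
// Float4 is a hypothetical stand-in for core.simd's float4 so this
// compiles on any target, with or without SIMD support.
alias Float4 = float[4];

// UNPCKLPS: interleave the two low lanes of a and b.
Float4 unpcklps(Float4 a, Float4 b) pure nothrow @nogc @safe
{
    Float4 r;
    r[0] = a[0];
    r[1] = b[0];
    r[2] = a[1];
    r[3] = b[1];
    return r;
}

// SHUFPS: result lanes 0 and 1 are picked from a, lanes 2 and 3 from b.
// Assumption: the template arguments are given in lane order
// (i0 selects result lane 0, ..., i3 selects result lane 3).
Float4 shufps(int i0, int i1, int i2, int i3)(Float4 a, Float4 b)
    pure nothrow @nogc @safe
{
    static assert(i0 >= 0 && i0 < 4 && i1 >= 0 && i1 < 4 &&
                  i2 >= 0 && i2 < 4 && i3 >= 0 && i3 < 4,
                  "shufps lane indices must be in 0 .. 4");
    Float4 r;
    r[0] = a[i0];
    r[1] = a[i1];
    r[2] = b[i2];
    r[3] = b[i3];
    return r;
}

// Hypothetical helper: the imm8 byte a real backend would pass to the
// SHUFPS instruction for the same four lane indices.
enum ubyte shufpsImm(int i0, int i1, int i2, int i3) =
    cast(ubyte)(i0 | (i1 << 2) | (i2 << 4) | (i3 << 6));
```

With a = [1, 2, 3, 4] and b = [5, 6, 7, 8], unpcklps(a, b) gives [1, 5, 2, 6] and shufps!(3, 2, 1, 0)(a, b) gives [4, 3, 6, 5]. A real implementation would keep the same signatures and replace the bodies with whatever each compiler provides.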

(1) Some way to support the rest of the SSE instructions needs to be added to DMD, of course.
