It's probably true that CUDA is used more; OpenCL is fugly C and no fun.

Microsoft's upcoming C++ AMP looks interesting, as it lets you write GPU and CPU code in C++. The spec is open, so hopefully other C++ compilers will implement it too.

SSE intrinsics in C++ are pretty essential for getting great performance, so I do think D needs something like this. A problem with intrinsics in C++ has been poor support from compilers, often performing little or no optimization and just blindly issuing instructions as you listed them, causing all kinds of extra loads and stores.
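
For anyone who hasn't used them, here is a rough sketch of what SSE intrinsics look like in C++. It assumes an x86 target with SSE; `add_arrays` is just a name I made up for illustration. Each intrinsic maps more or less to one instruction, which is why a compiler that doesn't optimize around them ends up emitting the loads and stores exactly as written.

```cpp
#include <cassert>
#include <cstddef>
#include <xmmintrin.h>  // SSE: __m128, _mm_loadu_ps, _mm_add_ps, _mm_storeu_ps

// Hypothetical helper (not from any library): out[i] = a[i] + b[i].
// The main loop handles four floats per iteration; a scalar loop
// finishes whatever is left over when n is not a multiple of four.
void add_arrays(const float* a, const float* b, float* out, std::size_t n) {
    assert(a && b && out);
    std::size_t i = 0;
    for (; i + 4 <= n; i += 4) {
        __m128 va = _mm_loadu_ps(a + i);            // unaligned load of 4 floats
        __m128 vb = _mm_loadu_ps(b + i);
        _mm_storeu_ps(out + i, _mm_add_ps(va, vb)); // 4 adds in one instruction
    }
    for (; i < n; ++i) {
        out[i] = a[i] + b[i];                       // scalar tail
    }
}
```

The unaligned loads are used only so the sketch works on any pointer; aligned data could use `_mm_load_ps` and `_mm_store_ps` instead.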

Visual Studio is actually one of the worst C++ compilers for intrinsics; ICC is likely the best.

So even if D does add these new intrinsic functions, it would need to actually optimize around them to produce reasonably fast code.

I agree that the v128 type should be typeless. It is typeless on hardware, and this makes it easier to mix and match instructions.
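
To illustrate the mix-and-match point, here is a small sketch in C++, assuming an x86 target with SSE2 (`abs4` is a made-up name). C++ splits the same 128-bit register into `__m128`, `__m128i` and `__m128d`, so using an integer bit mask on float data needs an explicit cast intrinsic, even though the cast generates no instruction. A typeless v128 would not need it.

```cpp
#include <cassert>
#include <emmintrin.h>  // SSE2: __m128i, _mm_set1_epi32, _mm_castsi128_ps

// Hypothetical helper (not from any library): absolute value of four
// floats, done by clearing the sign bit of each 32-bit lane.
void abs4(const float* in, float* out) {
    assert(in && out);
    __m128 v = _mm_loadu_ps(in);
    // Integer-typed constant: every lane is 0x7fffffff (all bits but the sign).
    __m128i mask = _mm_set1_epi32(0x7fffffff);
    // The cast only changes the C++ type; the register contents are untouched.
    __m128 r = _mm_and_ps(v, _mm_castsi128_ps(mask));
    _mm_storeu_ps(out, r);
}
```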
