On Monday, 18 April 2016 at 00:27:06 UTC, Joe Duarte wrote:
On Tuesday, 5 April 2016 at 10:27:46 UTC, Walter Bright wrote:
Besides, I think it's a poor design to customize the app for only one SIMD type. A better idea (I've repeated this ad nauseam over the years) is to have n modules, one for each supported SIMD type. Compile and link all of them in, then detect the SIMD type at runtime and call the corresponding module. (This is how the D array ops are currently implemented.)

There are many organizations in the world that are building software in-house, where such software is targeted to modern CPU SIMD types, most typically AVX/AVX2 and crypto instructions.


In addition, this is the COMPILER's job, not the programmer's!
The compiler SHOULD be able to vectorize the code using SSE/AVX depending on a command-line switch. Why should I write all this merde? Let the compiler do its work.

Also, the compiler CAN generate multiple versions of one function using different SIMD instructions. The Intel C++ Compiler works this way: it generates several versions of a function, checks the CPU's capabilities at run time, and executes the fastest one.
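
To make the idea concrete, here is a rough sketch of what that kind of run-time dispatch looks like when written by hand in C. It is not what the Intel compiler or the D runtime actually emits; it only illustrates the pattern. It assumes GCC or Clang on x86, since `__attribute__((target("avx2")))`, `__builtin_cpu_init` and `__builtin_cpu_supports` are extensions of those compilers. The names `add_fn`, `add_baseline`, `add_avx2`, `select_add` and `add_arrays` are made up for this example.

```c
#include <stddef.h>

/* Hypothetical kernel type: element-wise addition of two float arrays. */
typedef void (*add_fn)(const float *a, const float *b, float *out, size_t n);

/* Baseline version, compiled for the default target. */
static void add_baseline(const float *a, const float *b, float *out, size_t n) {
    for (size_t i = 0; i < n; i++)
        out[i] = a[i] + b[i];
}

#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
#define HAVE_X86_DISPATCH 1
/* Same loop, but the target attribute allows the compiler to use AVX2
   instructions inside this one function only (GCC/Clang extension). */
__attribute__((target("avx2")))
static void add_avx2(const float *a, const float *b, float *out, size_t n) {
    for (size_t i = 0; i < n; i++)
        out[i] = a[i] + b[i];
}
#endif

/* Choose an implementation based on what the CPU reports at run time. */
static add_fn select_add(void) {
#ifdef HAVE_X86_DISPATCH
    __builtin_cpu_init();                 /* GCC/Clang built-in */
    if (__builtin_cpu_supports("avx2"))
        return add_avx2;
#endif
    return add_baseline;                  /* safe fallback on any CPU */
}

/* Public entry point: selects once, then reuses the cached pointer.
   Note: this simple cache is not synchronized across threads. */
void add_arrays(const float *a, const float *b, float *out, size_t n) {
    static add_fn impl = NULL;
    if (impl == NULL)
        impl = select_add();
    impl(a, b, out, n);
}
```

Callers only ever see `add_arrays`; which body runs is decided on first use. Whether the AVX2 variant is actually vectorized still depends on the optimization level you compile with, which is exactly the part I would rather leave to the compiler.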
