On Thursday, 31 March 2016 at 08:23:45 UTC, Martin Nowak wrote:
I'm currently working on a templated arrayop implementation (using RPN
to encode ASTs).
So far things worked out great, but now I got stuck b/c apparently none
of the D compilers has a working SIMD implementation (maybe GDC has but
it's very difficult to work w/ the 2.066 frontend).
https://github.com/MartinNowak/druntime/blob/arrayOps/src/core/internal/arrayop.d
https://github.com/MartinNowak/dmd/blob/arrayOps/src/arrayop.d
I don't want to do anything fancy, just unaligned loads,
stores, and integral mul/div. Is this really the current state
of SIMD or am I missing sth.?
-Martin
ndslice.algorithm [1], [2], compiled with a recent LDC beta, will do
all the work for you. The vectorized flag should be turned on, and the
last (row) dimension should have stride == 1.
Generic matrix-matrix multiplication [3] is available in Mir
version 0.16.0-beta2.
It should be compiled with a recent LDC beta and the -mcpu=native flag.
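
To illustrate why the stride == 1 requirement matters, here is a minimal sketch in plain D. It does not use the Mir API: mulRows is a hypothetical helper written only for this post, and it relies on built-in array operations over a row-major matrix stored in a flat array. Because each row is a contiguous slice, the inner element-wise multiply is the kind of loop an optimizing backend may be able to vectorize.

```d
// Hypothetical helper for illustration, not part of Mir or druntime.
// Multiplies two row-major matrices element-wise, one row at a time.
void mulRows(int[] dst, const(int)[] lhs, const(int)[] rhs,
             size_t rows, size_t cols)
{
    assert(dst.length == rows * cols);
    assert(lhs.length == dst.length && rhs.length == dst.length);

    foreach (r; 0 .. rows)
    {
        const lo = r * cols;
        const hi = lo + cols;
        // Each row is a contiguous slice (stride == 1), so this
        // built-in array operation walks memory linearly.
        dst[lo .. hi] = lhs[lo .. hi] * rhs[lo .. hi];
    }
}

unittest
{
    auto a = [1, 2, 3, 4, 5, 6];       // 2 x 3 matrix
    auto b = [10, 20, 30, 40, 50, 60]; // 2 x 3 matrix
    auto c = new int[6];
    mulRows(c, a, b, 2, 3);
    assert(c == [10, 40, 90, 160, 250, 360]);
}
```

Assuming the file is saved as app.d (a made-up name), something like `ldc2 -O -mcpu=native -unittest -main app.d` should build it with the native instruction set enabled. Whether the loop actually ends up vectorized depends on the compiler version and target, so check the generated assembly.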
[1] http://docs.mir.dlang.io/latest/mir_ndslice_algorithm.html
[2] https://github.com/dlang/phobos/pull/4652
[3] http://docs.mir.dlang.io/latest/mir_glas_gemm.html