On Saturday, 26 December 2015 at 19:57:19 UTC, Ilya Yaroshenko wrote:
I will write the GEMM and GEMV families of BLAS routines for Phobos.
- code without assembler
- code based on SIMD instructions
- DMD/LDC/GDC support
- kernel based architecture like OpenBLAS
- 85-100% of the FLOPS of OpenBLAS (taken as 100%)
- tiny generic code compared with OpenBLAS
- ability to define user kernels
- allocator support (GEMM requires small internal allocations)
- @nogc nothrow pure template functions (depending on the allocator)
- optional multithreading
- ability to work with `Slice` multidimensional arrays whose
element stride within a vector is greater than 1. In conventional
BLAS, the stride between adjacent elements of a row or column
always equals 1.
LDC (all targets): very generic D/LLVM IR kernels. AVX/AVX2/AVX-512/NEON
support out of the box.
DMD/GDC x86 : kernels using 8 XMM registers, based on core.simd
DMD/GDC x86_64: kernels using 16 XMM registers, based on core.simd
DMD/GDC other : generic kernels without SIMD instructions.
AVX/AVX2/AVX-512 support can be added in the future.
References:
 - Anatomy of High-Performance Matrix Multiplication
 - OpenBLAS: https://github.com/xianyi/OpenBLAS
Happy New Year!
I am absolutely thrilled! I've been using scid
(https://github.com/kyllingstad/scid) and cblas
(https://github.com/DlangScience/cblas) in a project, and I can't
wait to see a smooth integration in the standard library.
Why will the functions be nothrow? It seems that if you try to
take the determinant of a 3x5 matrix, you should get an exception.
By 'tiny generic code', you mean that DGEMM, SSYMM, CTRMM, etc.
all become one function, basically?
You mention that you'll have GEMM and GEMV in your features, do
you think we'll get a more complete slice of BLAS/LAPACK in the
future, like GESVD and GEES?
If it's not in the plan, I'd be happy to work on re-tooling scid
and cblas to feel like std.blas. (That is, mimic how you choose
to represent a matrix, throw the same types of exceptions, and
so on, while still using the external libraries.)
Thanks again for this!