On Saturday, 26 December 2015 at 19:57:19 UTC, Ilya Yaroshenko wrote:
Hi,

I will write GEMM and GEMV families of BLAS for Phobos.

Goals:
 - code without assembler
 - code based on SIMD instructions
 - DMD/LDC/GDC support
 - kernel based architecture like OpenBLAS
 - 85-100% of the FLOPS of OpenBLAS (taken as 100%)
 - tiny generic code compared with OpenBLAS
 - ability to define user kernels
 - allocators support. GEMM requires small internal allocations.
 - @nogc nothrow pure template functions (depends on allocator)
 - optional multithreading
 - ability to work with `Slice` multidimensional arrays when the stride between elements in a vector is greater than 1. In common BLAS, one of the matrix strides (between rows or between columns) always equals 1.
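
To illustrate the last goal, here is a minimal sketch of a matrix-vector product in which the vector elements are more than one position apart in memory. `stridedGemv` and its parameters are hypothetical names chosen for this example; they are not the proposed std.blas API, and the loop is a plain reference implementation, not a SIMD kernel.

```d
// Hypothetical helper, not part of Phobos or the proposed std.blas.
// Computes y = A*x for a row-major m-by-n matrix A, where consecutive
// elements of x and y are strideX and strideY positions apart in memory.
void stridedGemv(T)(const(T)[] a, size_t m, size_t n,
                    const(T)[] x, size_t strideX,
                    T[] y, size_t strideY) @nogc nothrow pure
{
    foreach (i; 0 .. m)
    {
        T sum = 0;
        foreach (j; 0 .. n)
            sum += a[i * n + j] * x[j * strideX]; // step through x by its stride
        y[i * strideY] = sum;                     // step through y by its stride
    }
}
```

With `strideX = 2`, the function reads every second element of `x`, which is the kind of access a column of a row-major matrix or a `Slice` view would need.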

Implementation details:
LDC all       : very generic D/LLVM IR kernels. AVX/AVX2/AVX-512/NEON support works out of the box.
DMD/GDC x86   : kernels for 8 XMM registers, based on core.simd
DMD/GDC x86_64: kernels for 16 XMM registers, based on core.simd
DMD/GDC other : generic kernels without SIMD instructions. AVX/AVX2/AVX-512 support can be added in the future.

References:
[1] Anatomy of High-Performance Matrix Multiplication: http://www.cs.utexas.edu/users/pingali/CS378/2008sp/papers/gotoPaper.pdf
[2] OpenBLAS  https://github.com/xianyi/OpenBLAS

Happy New Year!

Ilya

I am absolutely thrilled! I've been using scid (https://github.com/kyllingstad/scid) and cblas (https://github.com/DlangScience/cblas) in a project, and I can't wait to see a smooth integration in the standard library.

A couple of questions:

Why will the functions be nothrow? It seems that if you try to take the determinant of a 3x5 matrix, you should get an exception.
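
For context on this question: in D, `nothrow` forbids throwing `Exception` but still allows `Error`, which includes failed assertions and contracts. The sketch below assumes a dimension mismatch is treated as a programming error rather than a recoverable condition. That is only one possible design, not something the announcement states. `Matrix` and `trace` are hypothetical names, and `trace` stands in for the determinant to keep the example short.

```d
// Hypothetical matrix type, for illustration only.
struct Matrix
{
    double[] data;      // row-major storage
    size_t rows, cols;
}

// Sum of the diagonal; only defined for square matrices.
// The function is nothrow, yet the `in` contract can still fail with an
// AssertError (an Error, not an Exception) when checks are enabled.
double trace(const Matrix m) @nogc nothrow pure
in
{
    assert(m.rows == m.cols, "trace requires a square matrix");
}
do
{
    double sum = 0;
    foreach (i; 0 .. m.rows)
        sum += m.data[i * m.cols + i];
    return sum;
}
```

Under this assumption, passing a 3x5 matrix would trip the contract in a debug build instead of raising a catchable exception, and the check would be compiled out in a release build.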

By 'tiny generic code', do you mean that DGEMM, SSYMM, CTRMM, etc. all become one function, basically?
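
As a rough picture of what I have in mind, here is a minimal sketch in which one template replaces the per-precision variants (the S/D prefixes). The name `gemm` and its signature are my own assumptions, not the proposed API. The sketch covers only the general (GE) case; symmetric (SY) or triangular (TR) variants would presumably need more than a type parameter.

```d
// Hypothetical sketch, not the proposed std.blas interface.
// Computes C = alpha*A*B + beta*C for row-major matrices:
// A is m-by-k, B is k-by-n, C is m-by-n.
void gemm(T)(size_t m, size_t n, size_t k, T alpha,
             const(T)[] a, const(T)[] b, T beta, T[] c) @nogc nothrow pure
{
    foreach (i; 0 .. m)
        foreach (j; 0 .. n)
        {
            T sum = 0;
            foreach (p; 0 .. k)
                sum += a[i * k + p] * b[p * n + j];
            c[i * n + j] = alpha * sum + beta * c[i * n + j];
        }
}
```

Instantiating it as `gemm!float` or `gemm!double` would then play the role of SGEMM or DGEMM.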

You mention that you'll have GEMM and GEMV in your features. Do you think we'll get a more complete slice of BLAS/LAPACK in the future, like GESVD and GEES?

If it's not in the plan, I'd be happy to work on re-tooling scid and cblas to feel like std.blas. (That is, mimic how you choose to represent a matrix, throw the same types of exceptions, etc., but still use external libraries.)

Thanks again for this!
