Re: 2016Q1: std.blas

Charles McAnany via Digitalmars-d-announce Sat, 26 Dec 2015 21:46:50 -0800

On Saturday, 26 December 2015 at 19:57:19 UTC, Ilya Yaroshenko wrote:

Hi,

I will write GEMM and GEMV families of BLAS for Phobos.
Goals:
 - code without assembler
 - code based on SIMD instructions
 - DMD/LDC/GDC support
 - kernel based architecture like OpenBLAS
 - 85-100% of the FLOPS of OpenBLAS (100%)
 - tiny generic code compared with OpenBLAS
 - ability to define user kernels
 - allocators support. GEMM requires small internal allocations.
 - @nogc nothrow pure template functions (depends on allocator)
 - optional multithreading
 - ability to work with `Slice` multidimensional arrays when the stride between elements in a vector is greater than 1. In common BLAS, the matrix stride between rows or columns always equals 1.
Implementation details:
LDC all : very generic D/LLVM IR kernels. AVX/2/512/neon support is out of the box.
DMD/GDC x86   : kernels for  8 XMM registers based on core.simd
DMD/GDC x86_64: kernels for 16 XMM registers based on core.simd
DMD/GDC other : generic kernels without SIMD instructions. AVX/2/512 support can be added in the future.
References:
[1] Anatomy of High-Performance Matrix Multiplication: http://www.cs.utexas.edu/users/pingali/CS378/2008sp/papers/gotoPaper.pdf
[2] OpenBLAS: https://github.com/xianyi/OpenBLAS

Happy New Year!

Ilya

I am absolutely thrilled! I've been using scid (https://github.com/kyllingstad/scid) and cblas (https://github.com/DlangScience/cblas) in a project, and I can't wait to see a smooth integration in the standard library.


Couple questions:

Why will the functions be nothrow? It seems that if you try to take the determinant of a 3x5 matrix, you should get an exception.
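
To make the trade-off concrete, here is a minimal sketch of the two error-reporting styles in question. The helpers `det2Checked` and `det2` are hypothetical names invented for this illustration; they are not part of any proposed std.blas API, and a real determinant routine would not be limited to 2x2 input.

```d
import std.exception : enforce, assertThrown;

// Hypothetical helper (illustration only): reports a shape error by throwing.
double det2Checked(const double[] a, size_t rows, size_t cols)
{
    enforce(rows == 2 && cols == 2 && a.length == 4,
            "det2Checked needs a 2x2 matrix");
    return a[0] * a[3] - a[1] * a[2];
}

// Hypothetical helper (illustration only): the nothrow style reports the same
// error through the return value, so it can also be @nogc and pure.
bool det2(const double[] a, size_t rows, size_t cols, out double result)
    @nogc nothrow pure
{
    if (rows != 2 || cols != 2 || a.length != 4)
        return false;
    result = a[0] * a[3] - a[1] * a[2];
    return true;
}

void main()
{
    double[] m = [1.0, 2.0, 3.0, 4.0];
    double d;

    assert(det2(m, 2, 2, d) && d == -2.0); // valid shape: 1*4 - 2*3
    assert(!det2(m, 3, 5, d));             // bad shape: status code, no throw
    assertThrown(det2Checked(m, 3, 5));    // bad shape: Exception
}
```

A nothrow function may still use `assert` or trigger a bounds check, because those raise an `Error` rather than an `Exception`.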

By 'tiny generic code', you mean that DGEMM, SSYMM, CTRMM, etc. all become one function, basically?
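
As a rough illustration of what "one function" could look like in D, a single template can cover the element types that BLAS spells as separate routines (SGEMM, DGEMM, and so on). The signature below is an assumption made up for this sketch, and the body is a naive triple loop over row-major arrays, not the kernel-based design described in the announcement. It only shows genericity over the element type; folding SYMM or TRMM into the same function would additionally need a way to describe matrix structure.

```d
// Hypothetical signature, for illustration only.
// Computes C = alpha*A*B + beta*C for row-major A (m x k), B (k x n), C (m x n).
void gemm(T)(size_t m, size_t n, size_t k,
             T alpha, const(T)[] a, const(T)[] b,
             T beta, T[] c) @nogc nothrow pure
{
    assert(a.length == m * k && b.length == k * n && c.length == m * n);
    foreach (i; 0 .. m)
        foreach (j; 0 .. n)
        {
            T acc = 0;
            foreach (p; 0 .. k)
                acc += a[i * k + p] * b[p * n + j];
            c[i * n + j] = alpha * acc + beta * c[i * n + j];
        }
}

void main()
{
    // Single precision: the role SGEMM plays in BLAS.
    float[] af = [1.0f, 2.0f, 3.0f, 4.0f];
    float[] bf = [5.0f, 6.0f, 7.0f, 8.0f];
    float[] cf = [0.0f, 0.0f, 0.0f, 0.0f];
    gemm!float(2, 2, 2, 1.0f, af, bf, 0.0f, cf);
    assert(cf == [19.0f, 22.0f, 43.0f, 50.0f]);

    // Double precision: the role DGEMM plays in BLAS, same template.
    double[] ad = [1.0, 2.0, 3.0, 4.0];
    double[] bd = [5.0, 6.0, 7.0, 8.0];
    double[] cd = [1.0, 1.0, 1.0, 1.0];
    gemm!double(2, 2, 2, 2.0, ad, bd, 1.0, cd);
    assert(cd == [39.0, 45.0, 87.0, 101.0]);
}
```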

You mention that you'll have GEMM and GEMV in your features; do you think we'll get a more complete slice of BLAS/LAPACK in the future, like GESVD and GEES?

If it's not in the plan, I'd be happy to work on re-tooling scid and cblas to feel like std.blas. (That is, mimic how you choose to represent a matrix, throw the same type of exceptions, etc., but still use external libraries.)


Thanks again for this!
