Re: 2016Q1: std.blas

Ilya Yaroshenko via Digitalmars-d-announce Sun, 27 Dec 2015 08:15:37 -0800

On Sunday, 27 December 2015 at 10:28:53 UTC, Russel Winder wrote:

On Sat, 2015-12-26 at 19:57 +0000, Ilya Yaroshenko via Digitalmars-d-announce wrote:
Hi,
I will write GEMM and GEMV families of BLAS for Phobos.

Goals:
  - code without assembler
  - code based on SIMD instructions
  - DMD/LDC/GDC support
  - kernel-based architecture like OpenBLAS
  - 85-100% of the FLOPS of OpenBLAS (OpenBLAS = 100%)
  - tiny generic code compared with OpenBLAS
  - ability to define user kernels
  - allocators support. GEMM requires small internal allocations.
  - @nogc nothrow pure template functions (depends on allocator)
  - optional multithreading
  - ability to work with `Slice` multidimensional arrays when
the stride between elements in a vector is greater than 1. In common
BLAS, the matrix stride between rows or columns always equals 1.

Shouldn't the goal of a project like this be to be something that OpenBLAS isn't? Given D's ability to call C and C++ code, it is not clear to me that simply rewriting OpenBLAS in D has any goal for the D or BLAS communities per se. Doesn't stop it being a fun activity for the programmer, obviously, but unless there is something that isn't in OpenBLAS, I cannot see this ever being competition and so building a community around the project.

It depends on what you mean by "something like this". OpenBLAS is a _huge_ amount of assembler code. For _each_ platform, for _each_ CPU generation, for _each_ floating point / complex type it has one kernel or a few kernels. It is 30 MB of assembler code.
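To make the contrast concrete, here is a minimal sketch in C (not std.blas code) of a kernel written once and instantiated per element type, with no assembler. The macro name `DEFINE_KERNEL_4X4`, the 4x4 block size, and the packed panel layout are all assumptions chosen for illustration. The fixed-size inner loops are the kind of code a compiler may be able to vectorize on its own.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical generic micro-kernel: computes C[4x4] += A_panel * B_panel.
   a: packed 4 x k panel of A, 4 values per step p (one column of the panel).
   b: packed k x 4 panel of B, 4 values per step p (one row of the panel).
   c: row-major 4x4 block inside a larger matrix with leading dimension ldc.
   The same source text is expanded once per element type T. */
#define DEFINE_KERNEL_4X4(T, NAME) \
    void NAME(size_t k, const T *a, const T *b, T *c, size_t ldc) \
    { \
        T acc[4][4] = {{0}}; \
        assert(ldc >= 4); \
        for (size_t p = 0; p < k; ++p) \
            for (size_t i = 0; i < 4; ++i) \
                for (size_t j = 0; j < 4; ++j) \
                    acc[i][j] += a[4 * p + i] * b[4 * p + j]; \
        for (size_t i = 0; i < 4; ++i) \
            for (size_t j = 0; j < 4; ++j) \
                c[i * ldc + j] += acc[i][j]; \
    }

/* One line per floating point type instead of one assembler file per type. */
DEFINE_KERNEL_4X4(float, kernel_4x4_f32)
DEFINE_KERNEL_4X4(double, kernel_4x4_f64)
```

In D the same idea would presumably be a template function rather than a macro; C is used here only because the BLAS interface is traditionally exposed to C.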

Not only can D code call C/C++, but C/C++ (and so any other language) can also call D code. So std.blas may be used in C/C++ projects like Julia.

Now if the threads/OpenCL/CUDA was front and centre so that a goal was to be Nx faster than OpenBLAS, that could be a goal worth standing behind.

It can be a goal for a standalone project. But a standard library should be portable to any platform without significant problems (especially without problems caused by matrix multiplication). So my goal is a tiny and portable project like ATLAS, but fast like OpenBLAS. BTW, threads in std.blas would be optional, like in OpenBLAS. Furthermore, std.blas will allow users to write their own kernels.
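As a rough illustration of what "user kernels" could mean, the sketch below (in C, not the planned std.blas API) shows a GEMM driver that packs panels into small internal buffers and hands each block to a kernel passed in by the caller. The names `gemm`, `gemm_kernel` and `reference_kernel`, the 4x4 block size, and the restriction that m and n are multiples of the block size are all assumptions made to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MR ((size_t)4) /* rows per block (assumed) */
#define NR ((size_t)4) /* columns per block (assumed) */

/* Hypothetical kernel signature: c[MR x NR] += a_panel * b_panel,
   where both panels are packed with MR (or NR) values per step p. */
typedef void (*gemm_kernel)(size_t k, const double *a, const double *b,
                            double *c, size_t ldc);

/* Portable reference kernel; a user could pass a tuned one instead. */
void reference_kernel(size_t k, const double *a, const double *b,
                      double *c, size_t ldc)
{
    for (size_t p = 0; p < k; ++p)
        for (size_t i = 0; i < MR; ++i)
            for (size_t j = 0; j < NR; ++j)
                c[i * ldc + j] += a[MR * p + i] * b[NR * p + j];
}

/* C += A * B for row-major A (m x k), B (k x n), C (m x n).
   Returns 0 on success, -1 if the packing buffers cannot be allocated. */
int gemm(size_t m, size_t n, size_t k,
         const double *A, const double *B, double *C, gemm_kernel kernel)
{
    assert(m % MR == 0 && n % NR == 0);
    if (k == 0)
        return 0; /* nothing to accumulate */

    /* The small internal allocations: one packed panel of A and one of B. */
    double *pa = malloc(MR * k * sizeof *pa);
    double *pb = malloc(NR * k * sizeof *pb);
    if (!pa || !pb) {
        free(pa);
        free(pb);
        return -1;
    }

    for (size_t i0 = 0; i0 < m; i0 += MR) {
        /* Pack an MR x k row panel of A so the kernel reads it contiguously. */
        for (size_t p = 0; p < k; ++p)
            for (size_t i = 0; i < MR; ++i)
                pa[MR * p + i] = A[(i0 + i) * k + p];

        for (size_t j0 = 0; j0 < n; j0 += NR) {
            /* Pack the matching k x NR column panel of B. */
            for (size_t p = 0; p < k; ++p)
                for (size_t j = 0; j < NR; ++j)
                    pb[NR * p + j] = B[p * n + j0 + j];

            kernel(k, pa, pb, &C[i0 * n + j0], n);
        }
    }

    free(pa);
    free(pb);
    return 0;
}
```

A call such as `gemm(m, n, k, A, B, C, reference_kernel)` uses the portable kernel; swapping the last argument is all it takes to plug in a different one. A real implementation would also handle edge blocks and cache blocking, which are left out here.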

Not to mention full N-dimension vectors so that D could seriously compete against Numpy in the Python world.

I am not sure how D can compete against Numpy in the Python world, but it can compete with Python in the world of programming languages. BTW, N-dimensional ranges/arrays/vectors are already implemented for Phobos:


PR:
https://github.com/D-Programming-Language/phobos/pull/3397

Updated Docs:
http://dtest.thecybershadow.net/artifact/website-76234ca0eab431527327d5ce1ec0ad74c6421533-fedfc857090c1c873b17e7a1e4cf853c/web/phobos-prerelease/std_experimental_ndslice.html
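To show what the strided `Slice` goal from the list means in practice, here is a small sketch in C (not ndslice or std.blas code). The `matrix_view` struct and `gemv_strided` function are hypothetical names. The point is that both the row stride and the column stride of the matrix are free parameters, whereas the classic BLAS GEMV interface takes a leading dimension plus vector increments and assumes one matrix dimension is contiguous.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical 2-D view over existing memory:
   element (i, j) lives at data[i * row_stride + j * col_stride]. */
typedef struct {
    const double *data;
    size_t rows, cols;
    size_t row_stride, col_stride;
} matrix_view;

/* y = A * x, with independent element strides for A, x and y. */
void gemv_strided(matrix_view A, const double *x, size_t incx,
                  double *y, size_t incy)
{
    assert(incx > 0 && incy > 0);
    for (size_t i = 0; i < A.rows; ++i) {
        double sum = 0.0;
        for (size_t j = 0; j < A.cols; ++j)
            sum += A.data[i * A.row_stride + j * A.col_stride] * x[j * incx];
        y[i * incy] = sum;
    }
}
```

For example, a view with `row_stride = 8` and `col_stride = 2` over a row-major 4x4 buffer selects every second row and every second column without copying anything.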

Please participate in the voting (the time limit has been extended) :-)
http://forum.dlang.org/thread/[email protected]


Ilya
