On 29.04.2014 at 02:01, Nathaniel Smith <n...@pobox.com> wrote:

> On Tue, Apr 29, 2014 at 12:52 AM, Sturla Molden <sturla.mol...@gmail.com> wrote:
>> On 29/04/14 01:30, Nathaniel Smith wrote:
>> 
>>> I finally read this paper:
>>> 
>>>    http://www.cs.utexas.edu/users/flame/pubs/blis2_toms_rev2.pdf
>>> 
>>> and I have to say that I'm no longer so convinced that OpenBLAS is the
>>> right starting point.
>> 
>> I think OpenBLAS in the long run is doomed as an OSS project. Having
>> huge portions of the source in assembly is not sustainable in 2014.
>> OpenBLAS (like GotoBLAS2 before it) runs a high risk of becoming
>> abandonware.
> 
> Have you read the paper I linked? I really recommend it. BLIS is
> apparently 95% straight-up-C, plus a slot where you stick in a tiny
> CPU-specific super-optimized kernel [1]. So this localizes the nasty
> stuff to one tiny function, plus most of the kernels that have been
> written so far do in fact use intrinsics [2].
> 
> [1] https://code.google.com/p/blis/wiki/KernelsHowTo
> [2] https://code.google.com/p/blis/wiki/HardwareSupport
> 

This summer I was teaching an undergraduate class, "Software Basics on
HPC".  Of course, one topic was the efficient implementation of the
matrix-matrix product GEMM.  The BLIS paper [1] is a great source for
that.

In my opinion, hands-on experience is very important for actually
understanding these concepts.  In particular, that means we implemented
our own matrix-matrix product.  The pure C (ANSI C) implementation has
fewer than 450 lines of code.  The code consists of several functions,
and the students developed these functions one by one, from one
assignment to the next.  You can see the result here:

        
http://apfel.mathematik.uni-ulm.de/~lehn/sghpc/gemm/page02/index.html#toc4
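For readers who do not want to click through, here is a rough sketch of
the overall structure that the BLIS paper describes: block loops, packing
of A and B, a macro kernel, and a small micro kernel.  It is not the
course code.  The names (gemm_sketch, pack_A, pack_B, macro_kernel,
micro_kernel), the row-major storage, the restriction to C <- C + A*B and
the block sizes are all assumptions made for illustration; the block
sizes in particular are not tuned.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative block sizes (assumptions, not tuned values).
 * MC must be a multiple of MR and NC a multiple of NR. */
#define MC 64u
#define KC 64u
#define NC 128u
#define MR 4u
#define NR 4u

static double A_pack[MC * KC];   /* packed mc x kc block of A */
static double B_pack[KC * NC];   /* packed kc x nc block of B */

/* Pack a row-major mc x kc block of A into panels of MR rows.
 * Rows beyond mc are padded with zeros. */
static void pack_A(size_t mc, size_t kc, const double *A, size_t lda,
                   double *buf)
{
    for (size_t i = 0; i < mc; i += MR)
        for (size_t l = 0; l < kc; ++l)
            for (size_t ii = 0; ii < MR; ++ii)
                *buf++ = (i + ii < mc) ? A[(i + ii) * lda + l] : 0.0;
}

/* Pack a row-major kc x nc block of B into panels of NR columns.
 * Columns beyond nc are padded with zeros. */
static void pack_B(size_t kc, size_t nc, const double *B, size_t ldb,
                   double *buf)
{
    for (size_t j = 0; j < nc; j += NR)
        for (size_t l = 0; l < kc; ++l)
            for (size_t jj = 0; jj < NR; ++jj)
                *buf++ = (j + jj < nc) ? B[l * ldb + j + jj] : 0.0;
}

/* Micro kernel: multiply one packed MR x kc panel of A with one packed
 * kc x NR panel of B, then add the valid mr x nr part of the result to C. */
static void micro_kernel(size_t kc, const double *a, const double *b,
                         double *C, size_t ldc, size_t mr, size_t nr)
{
    double AB[MR * NR] = {0.0};
    for (size_t l = 0; l < kc; ++l)
        for (size_t i = 0; i < MR; ++i)
            for (size_t j = 0; j < NR; ++j)
                AB[i * NR + j] += a[l * MR + i] * b[l * NR + j];
    for (size_t i = 0; i < mr; ++i)
        for (size_t j = 0; j < nr; ++j)
            C[i * ldc + j] += AB[i * NR + j];
}

/* Macro kernel: loop over all MR x NR tiles of the current block of C. */
static void macro_kernel(size_t mc, size_t nc, size_t kc,
                         double *C, size_t ldc)
{
    for (size_t j = 0; j < nc; j += NR)
        for (size_t i = 0; i < mc; i += MR) {
            size_t mr = (mc - i < MR) ? mc - i : MR;
            size_t nr = (nc - j < NR) ? nc - j : NR;
            /* Panel starting at row i (column j) begins at offset i*kc (j*kc). */
            micro_kernel(kc, &A_pack[i * kc], &B_pack[j * kc],
                         &C[i * ldc + j], ldc, mr, nr);
        }
}

/* C <- C + A*B for row-major A (m x k), B (k x n) and C (m x n). */
void gemm_sketch(size_t m, size_t n, size_t k,
                 const double *A, const double *B, double *C)
{
    assert(A && B && C);
    for (size_t jc = 0; jc < n; jc += NC) {
        size_t nc = (n - jc < NC) ? n - jc : NC;
        for (size_t pc = 0; pc < k; pc += KC) {
            size_t kc = (k - pc < KC) ? k - pc : KC;
            pack_B(kc, nc, &B[pc * n + jc], n, B_pack);
            for (size_t ic = 0; ic < m; ic += MC) {
                size_t mc = (m - ic < MC) ? m - ic : MC;
                pack_A(mc, kc, &A[ic * k + pc], k, A_pack);
                macro_kernel(mc, nc, kc, &C[ic * n + jc], n);
            }
        }
    }
}
```

The point of this layout is that only micro_kernel needs to be replaced
by a CPU-specific version; everything else stays plain C.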

Other assignments were about improving the micro kernel with SSE
instructions.  You can traverse the pages to see how we did so step by
step.
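To give an idea of what such a step can look like, here is a minimal
sketch of a 4x4 micro kernel for doubles written with SSE2 intrinsics.
It is not the version from the course pages.  The function name
micro_kernel_sse, the 4x4 tile size and the packed layout (a holds 4
values per step l, b holds 4 values per step l) are assumptions, and the
code falls back to plain C when SSE2 is not available.

```c
#include <assert.h>
#include <stddef.h>
#if defined(__SSE2__)
#include <emmintrin.h>   /* SSE2 intrinsics for packed doubles */
#endif

#define SSE_MR 4u   /* rows of the tile (assumed) */
#define SSE_NR 4u   /* columns of the tile (assumed) */

/* Computes AB[i*4 + j] = sum over l of a[l*4 + i] * b[l*4 + j]
 * for a 4x4 tile stored row-major in AB. */
void micro_kernel_sse(size_t kc, const double *a, const double *b,
                      double *AB)
{
    assert(a && b && AB);
#if defined(__SSE2__)
    /* Each row of the tile is kept in two registers of two doubles. */
    __m128d acc[SSE_MR][SSE_NR / 2];
    for (size_t i = 0; i < SSE_MR; ++i) {
        acc[i][0] = _mm_setzero_pd();
        acc[i][1] = _mm_setzero_pd();
    }
    for (size_t l = 0; l < kc; ++l) {
        /* Load the current row of the packed B panel. */
        __m128d b01 = _mm_loadu_pd(&b[l * SSE_NR]);
        __m128d b23 = _mm_loadu_pd(&b[l * SSE_NR + 2]);
        for (size_t i = 0; i < SSE_MR; ++i) {
            /* Broadcast one element of A and update row i of the tile. */
            __m128d ai = _mm_set1_pd(a[l * SSE_MR + i]);
            acc[i][0] = _mm_add_pd(acc[i][0], _mm_mul_pd(ai, b01));
            acc[i][1] = _mm_add_pd(acc[i][1], _mm_mul_pd(ai, b23));
        }
    }
    for (size_t i = 0; i < SSE_MR; ++i) {
        _mm_storeu_pd(&AB[i * SSE_NR], acc[i][0]);
        _mm_storeu_pd(&AB[i * SSE_NR + 2], acc[i][1]);
    }
#else
    /* Portable fallback with the same result. */
    for (size_t i = 0; i < SSE_MR * SSE_NR; ++i)
        AB[i] = 0.0;
    for (size_t l = 0; l < kc; ++l)
        for (size_t i = 0; i < SSE_MR; ++i)
            for (size_t j = 0; j < SSE_NR; ++j)
                AB[i * SSE_NR + j] += a[l * SSE_MR + i] * b[l * SSE_NR + j];
#endif
}
```

Unaligned loads and stores are used so the sketch does not depend on how
the buffers were allocated; with aligned packing buffers the aligned
variants would be the natural next step.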

Please understand that this course material is still a work in progress
and needs some polish here and there.  Still, it could be useful for
others, and even serve as a starting point for a simple BLAS
implementation.

Cheers,

Michael


[1]: http://www.cs.utexas.edu/users/flame/pubs/BLISTOMSrev2.pdf


-----------------------------------------------------------------------------------
Dr. Michael Lehn
University of Ulm, Institute for Numerical Mathematics
Helmholtzstr. 20
D-89069 Ulm, Germany
Phone: (+49) 731 50-23534, Fax: (+49) 731 50-23548
-----------------------------------------------------------------------------------
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
