On 29.04.2014 at 02:01, Nathaniel Smith <n...@pobox.com> wrote:

> On Tue, Apr 29, 2014 at 12:52 AM, Sturla Molden <sturla.mol...@gmail.com>
> wrote:
>> On 29/04/14 01:30, Nathaniel Smith wrote:
>>
>>> I finally read this paper:
>>>
>>> http://www.cs.utexas.edu/users/flame/pubs/blis2_toms_rev2.pdf
>>>
>>> and I have to say that I'm no longer so convinced that OpenBLAS is the
>>> right starting point.
>>
>> I think OpenBLAS in the long run is doomed as an OSS project. Having
>> huge portions of the source in assembly is not sustainable in 2014.
>> OpenBLAS (like GotoBLAS2 before it) runs a high risk of becoming
>> abandonware.
>
> Have you read the paper I linked? I really recommend it. BLIS is
> apparently 95% straight-up-C, plus a slot where you stick in a tiny
> CPU-specific super-optimized kernel [1]. So this localizes the nasty
> stuff to one tiny function, plus most of the kernels that have been
> written so far do in fact use intrinsics [2].
>
> [1] https://code.google.com/p/blis/wiki/KernelsHowTo
> [2] https://code.google.com/p/blis/wiki/HardwareSupport
>
This summer I taught an undergraduate class, "Software Basics on HPC". One
topic, of course, was the efficient implementation of the matrix-matrix
product GEMM. The BLIS paper [1] is a great source for that. In my opinion,
hands-on experience is very important for actually understanding these
concepts, so we implemented our own matrix-matrix product. The pure C
(ANSI C) implementation has fewer than 450 lines of code. The code consists
of several functions, and the students developed these functions one by one
from one assignment to the next. You can see the result here:

http://apfel.mathematik.uni-ulm.de/~lehn/sghpc/gemm/page02/index.html#toc4

Other assignments were about improving the micro kernel with SSE
instructions. You can step through the pages to see how we did this, one
stage at a time.

Please understand that this course material is still a work in progress and
needs some polish here and there. Still, it could be useful for others, and
even serve as a starting point for a simple BLAS implementation.

Cheers,

Michael

[1]: http://www.cs.utexas.edu/users/flame/pubs/BLISTOMSrev2.pdf

-----------------------------------------------------------------------------------
Dr. Michael Lehn
University of Ulm, Institute for Numerical Mathematics
Helmholtzstr. 20
D-89069 Ulm, Germany
Phone: (+49) 731 50-23534, Fax: (+49) 731 50-23548
-----------------------------------------------------------------------------------
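
P.S. For readers who want a feel for the structure before following the
link, here is a minimal sketch of the layering the BLIS paper describes:
blocks of A and B are packed into contiguous buffers, a macro kernel walks
over the packed panels, and a small micro kernel does the actual
multiply-add. This is not the course code. The block sizes (MC, NC, KC, MR,
NR) and all function names (gemm_nn, macro_kernel, micro_kernel, pack_A,
pack_B) are assumptions picked for illustration, the matrices are assumed to
be column-major, and the routine computes C += A*B only.

```c
#include <assert.h>

/* Hypothetical block sizes for illustration; real values are tuned per CPU.
   MC must be a multiple of MR and NC a multiple of NR. */
enum { MC = 8, NC = 8, KC = 8, MR = 4, NR = 4 };

/* Micro kernel: ab (MR x NR, column-major) = packed A panel * packed B panel.
   This is the one function a CPU-specific version would replace. */
static void micro_kernel(int kc, const double *a, const double *b, double *ab)
{
    for (int i = 0; i < MR * NR; ++i) ab[i] = 0.0;
    for (int l = 0; l < kc; ++l)
        for (int j = 0; j < NR; ++j)
            for (int i = 0; i < MR; ++i)
                ab[i + j * MR] += a[l * MR + i] * b[l * NR + j];
}

/* Pack an mc x kc block of A into panels of MR rows, zero-padding the last. */
static void pack_A(int mc, int kc, const double *A, int lda, double *buf)
{
    for (int i0 = 0; i0 < mc; i0 += MR)
        for (int l = 0; l < kc; ++l)
            for (int i = 0; i < MR; ++i)
                *buf++ = (i0 + i < mc) ? A[(i0 + i) + l * lda] : 0.0;
}

/* Pack a kc x nc block of B into panels of NR columns, zero-padding the last. */
static void pack_B(int kc, int nc, const double *B, int ldb, double *buf)
{
    for (int j0 = 0; j0 < nc; j0 += NR)
        for (int l = 0; l < kc; ++l)
            for (int j = 0; j < NR; ++j)
                *buf++ = (j0 + j < nc) ? B[l + (j0 + j) * ldb] : 0.0;
}

/* Macro kernel: multiply the packed blocks and add the result into C,
   skipping the zero-padded rows and columns at the edges. */
static void macro_kernel(int mc, int nc, int kc,
                         const double *Ap, const double *Bp,
                         double *C, int ldc)
{
    double ab[MR * NR];
    for (int j0 = 0; j0 < nc; j0 += NR)
        for (int i0 = 0; i0 < mc; i0 += MR) {
            micro_kernel(kc, Ap + i0 * kc, Bp + j0 * kc, ab);
            for (int j = 0; j < NR && j0 + j < nc; ++j)
                for (int i = 0; i < MR && i0 + i < mc; ++i)
                    C[(i0 + i) + (j0 + j) * ldc] += ab[i + j * MR];
        }
}

/* C (m x n) += A (m x k) * B (k x n), all column-major. */
void gemm_nn(int m, int n, int k,
             const double *A, int lda,
             const double *B, int ldb,
             double *C, int ldc)
{
    static double Ap[MC * KC], Bp[KC * NC];   /* packing buffers */
    assert(m >= 0 && n >= 0 && k >= 0);
    for (int jc = 0; jc < n; jc += NC) {
        int nc = (n - jc < NC) ? n - jc : NC;
        for (int pc = 0; pc < k; pc += KC) {
            int kc = (k - pc < KC) ? k - pc : KC;
            pack_B(kc, nc, B + pc + jc * ldb, ldb, Bp);
            for (int ic = 0; ic < m; ic += MC) {
                int mc = (m - ic < MC) ? m - ic : MC;
                pack_A(mc, kc, A + ic + pc * lda, lda, Ap);
                macro_kernel(mc, nc, kc, Ap, Bp, C + ic + jc * ldc, ldc);
            }
        }
    }
}
```

The point of this layout is that only micro_kernel would need to change when
moving to SSE or another instruction set; the packing and the loops around
it stay plain C. The static packing buffers keep the sketch short but make
it non-reentrant.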