The OpenBLAS framework is described in
http://dl.acm.org/citation.cfm?id=2503219 and builds on top of GotoBLAS
(http://dl.acm.org/citation.cfm?id=1377607).

Jutho: Part of the conclusion from your link is that you cannot write a
fast matrix multiplication in C; you have to write it in assembly to
keep the compiler from destroying the performance.
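
For anyone who has not read the tutorial, here is a rough sketch in plain C
of the loop structure it builds up: an outer blocking over the matrices and a
small micro-kernel that accumulates one tile of C in local variables. The
names gemm_blocked and micro_kernel and the block sizes MR, NR and KC are my
own placeholders, not anything taken from BLIS or OpenBLAS, and the sketch
leaves out the packing of A and B into contiguous buffers that the real
libraries do. It assumes row-major storage and computes C += A*B.

```c
#include <stddef.h>

/* Hypothetical block sizes, for illustration only; real libraries tune
   these per CPU. */
enum { MR = 4, NR = 4, KC = 64 };

/* Micro-kernel: C[0:mr, 0:nr] += A[0:mr, 0:kc] * B[0:kc, 0:nr].
   The tile of C is accumulated in a small local array, which is the part
   that tuned libraries keep in registers and write in assembly. */
static void micro_kernel(size_t mr, size_t nr, size_t kc,
                         const double *A, size_t lda,
                         const double *B, size_t ldb,
                         double *C, size_t ldc)
{
    double acc[MR][NR] = {{0.0}};

    for (size_t p = 0; p < kc; ++p)
        for (size_t i = 0; i < mr; ++i)
            for (size_t j = 0; j < nr; ++j)
                acc[i][j] += A[i * lda + p] * B[p * ldb + j];

    for (size_t i = 0; i < mr; ++i)
        for (size_t j = 0; j < nr; ++j)
            C[i * ldc + j] += acc[i][j];
}

/* C (m x n) += A (m x k) * B (k x n), all row-major and contiguous. */
void gemm_blocked(size_t m, size_t n, size_t k,
                  const double *A, const double *B, double *C)
{
    for (size_t pc = 0; pc < k; pc += KC) {
        size_t kc = (k - pc < KC) ? k - pc : KC;
        for (size_t i = 0; i < m; i += MR) {
            size_t mr = (m - i < MR) ? m - i : MR;
            for (size_t j = 0; j < n; j += NR) {
                size_t nr = (n - j < NR) ? n - j : NR;
                micro_kernel(mr, nr, kc,
                             &A[i * k + pc], k,
                             &B[pc * n + j], n,
                             &C[i * n + j], n);
            }
        }
    }
}
```

Whether the compiler keeps acc in vector registers and schedules the inner
loops well is exactly what you cannot control from C, which is why the
micro-kernel is the piece that ends up hand-written.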

2015-07-08 11:50 GMT-04:00 Jutho <[email protected]>:

> I found a rather instructive tutorial about the kind of optimisations
> going into matrix multiplication in BLIS (not OpenBLAS but related) here:
> http://apfel.mathematik.uni-ulm.de/%7Elehn/sghpc/gemm/index.html
>
> It's not something one can implement in Julia (yet). Hopefully further
> work in the direction of vectorisation of tuples will help (e.g. issue
> #11899 and related)...
