On Wed, Dec 20, 2017 at 07:15:16PM +0100, Ricardo Wurmus wrote:
> Is it really single-threaded?  I remember having a couple of problems
> with OpenBLAS on our cluster when it is used with Numpy as both would
> spawn lots of threads.  The solution was to limit OpenBLAS to at most
> two threads.

It looks like it uses 1 thread on my system.
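
For reference, a minimal sketch of how the thread count can be capped
from Python. It assumes NumPy is linked against OpenBLAS and that the
variable is set before NumPy is imported for the first time. The helper
name cap_openblas_threads is made up for illustration.

```python
import os


def cap_openblas_threads(n):
    """Ask OpenBLAS to use at most n threads (hypothetical helper).

    OpenBLAS reads OPENBLAS_NUM_THREADS when the library is loaded, so
    this has to run before NumPy is imported for the first time.
    """
    if n < 1:
        raise ValueError("thread count must be at least 1")
    os.environ["OPENBLAS_NUM_THREADS"] = str(n)
    return os.environ["OPENBLAS_NUM_THREADS"]


if __name__ == "__main__":
    cap_openblas_threads(2)
    try:
        import numpy as np
    except ImportError:
        print("NumPy is not installed; only the environment was set")
    else:
        # Prints which BLAS/LAPACK libraries NumPy was built against.
        np.show_config()
```

Setting the variable in the shell before starting Python has the same
effect and avoids the import-order concern.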

> > If I compile for a target it
> > makes a large difference.
> 
> The FAQ document[1] says this:
> 
>   The environment variable which control the kernel selection is
>   OPENBLAS_CORETYPE (see driver/others/dynamic.c) e.g. export
>   OPENBLAS_CORETYPE=Haswell. And the function char*
>   openblas_get_corename() returns the used target.
> 
> [1]: https://github.com/xianyi/OpenBLAS/wiki/Faq
> 
> Have you tried this and compared the performance?

About 10x difference on 24+ cores for matrix multiplication (my
version vs what comes with Guix).
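
The runtime kernel selection from the FAQ could be timed with something
like the sketch below. It is not the benchmark behind the number above.
It assumes NumPy is installed and linked against an OpenBLAS that honours
OPENBLAS_CORETYPE (as far as I understand, only DYNAMIC_ARCH builds do).
The names BENCH, bench_env and time_matmul and the matrix size are
arbitrary choices for illustration.

```python
import os
import subprocess
import sys

# Run in a child process so that every measurement loads OpenBLAS afresh
# and picks up the environment it was started with.
BENCH = """
import time
import numpy as np
n = 2000
a = np.random.rand(n, n)
b = np.random.rand(n, n)
start = time.perf_counter()
a @ b
print(time.perf_counter() - start)
"""


def bench_env(coretype=None):
    """Return a copy of the environment with OPENBLAS_CORETYPE set or unset."""
    env = dict(os.environ)
    if coretype is None:
        env.pop("OPENBLAS_CORETYPE", None)
    else:
        env["OPENBLAS_CORETYPE"] = coretype
    return env


def time_matmul(coretype=None):
    """Time one n x n matrix product in a child process, in seconds."""
    result = subprocess.run(
        [sys.executable, "-c", BENCH],
        env=bench_env(coretype),
        capture_output=True,
        text=True,
        check=True,
    )
    return float(result.stdout.strip())
```

Calling time_matmul() and then time_matmul("Haswell") gives two
wall-clock figures to compare. Forcing a core type is only safe on a CPU
that actually supports those kernels.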

I do think we need to default to a conservative OpenBLAS for general
use. The question is how we make it fly on dedicated hardware.

Something like

  package python-numpy:openblas-haswellp

for the parallel version? The same would be needed for R and others.
The problem is that this multiplies the number of package variants.

Pj.

