It seems to work fine on 0.4. On my dual-core i5:

julia> peakflops()
6.3990880531633675e10

julia> blas_set_num_threads(1)

julia> peakflops()
3.2582507660855206e10
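
The same comparison can be sketched on a current Julia, where the 0.4-era Base functions `blas_set_num_threads` and `peakflops` now live in the `LinearAlgebra` standard library (names and module below are the modern API, not what this thread's Julia 0.4 used):

```julia
using LinearAlgebra

BLAS.set_num_threads(1)                 # pin OpenBLAS to a single thread
serial = LinearAlgebra.peakflops()      # times a large double-precision gemm

BLAS.set_num_threads(Sys.CPU_THREADS)   # allow one thread per logical core
threaded = LinearAlgebra.peakflops()

println("1 thread:  ", serial / 1e9, " Gflops")
println("all cores: ", threaded / 1e9, " Gflops")
```

On newer Julia versions, `BLAS.get_num_threads()` can also confirm that the setting actually took effect before benchmarking.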

-viral

On Friday, March 6, 2015 at 10:27:46 PM UTC+5:30, Steven G. Johnson wrote:
>
> For my numerics class at MIT <http://math.mit.edu/~stevenj/18.335/>, I 
> used the following notebook to talk about cache effects and matrix 
> multiplication:
>
>     
> http://nbviewer.ipython.org/url/math.mit.edu/~stevenj/18.335/Matrix-multiplication-experiments.ipynb
>
> It includes some code to benchmark the built-in BLAS-based multiplication 
> against some simpler algorithms, and for comparison purposes I used 
> blas_set_num_threads(1) to benchmark only serial performance... I thought.
>
> When I ran the benchmark on my desktop, the results made sense: OpenBLAS 
> got about 3 * 4 = 12 Gflops, which is peak performance for a 3 GHz CPU that 
> can perform 4 flops per cycle (via 256-bit AVX instructions). However, on 
> my laptop, it got about 40 Gflops, which only makes sense if it was using 
> additional cores. In both cases, this was with Julia 0.4 using OpenBLAS.
>
> Is there any reason why blas_set_num_threads(1) would not be sufficient to 
> disable additional cores?
>
> --SGJ
>
