I have found that I get better performance from some OpenBLAS routines by setting the number of BLAS threads to the number of physical CPU cores (half the value returned by CPU_CORES when hyperthreading is enabled):
Base.blas_set_num_threads(div(CPU_CORES, 2))

--Peter

On Thursday, September 18, 2014 3:09:17 PM UTC-7, Stephan Buchert wrote:
>
> Thanks for the tips. I have now compiled julia on my laptop, and the
> results are:
>
> julia> versioninfo()
> Julia Version 0.3.0+6
> Commit 7681878* (2014-08-20 20:43 UTC)
> Platform Info:
>   System: Linux (x86_64-redhat-linux)
>   CPU: Intel(R) Core(TM) i7-4700MQ CPU @ 2.40GHz
>   WORD_SIZE: 64
>   BLAS: libopenblas (USE64BITINT DYNAMIC_ARCH NO_AFFINITY Haswell)
>   LAPACK: libopenblas
>   LIBM: libopenlibm
>   LLVM: libLLVM-3.3
>
> julia> include("code/julia/bench.jl")
> LU decomposition, elapsed time: 0.123349203 seconds
> FFT, elapsed time: 0.20440579 seconds
>
> Matlab r2014a, with [L,U,P] = lu(A); instead of just lu(A);
> LU decomposition, elapsed time: 0.0586 seconds
> FFT elapsed time: 0.0809 seconds
>
> So a great improvement, but Julia still seems 2-3 times slower than
> Matlab (that is, than its underlying linear algebra libraries) on these
> two very limited benchmarks. Perhaps Matlab has found a way to speed up
> its linear algebra recently?
>
> The Fedora precompiled OpenBLAS was already installed at the first test
> (and presumably used by Julia), but, as Andreas has also pointed out, it
> seems to be significantly slower than an OpenBLAS library compiled along
> with the Julia installation.
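For reference, here is a minimal sketch of how such a comparison could be timed in Julia 0.3. The contents of bench.jl are not shown in the thread, so the matrix size (2000x2000) and the exact calls below are assumptions, not Stephan's actual benchmark:

```julia
# Restrict BLAS to physical cores (assumes hyperthreading doubles CPU_CORES).
Base.blas_set_num_threads(div(CPU_CORES, 2))

A = rand(2000, 2000)
lu(A)              # warm-up run so JIT compilation is not timed
@time lu(A)        # LU decomposition

x = rand(2^20)
fft(x)             # warm-up run
@time fft(x)       # FFT
```

The warm-up calls matter: without them, the first `@time` also measures compilation, which would unfairly inflate Julia's numbers relative to Matlab's.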