On 29 October 2015 at 20:25, Julian Taylor <jtaylor.deb...@googlemail.com> wrote:
> should be possible by putting this into: ~/.numpy-site.cfg
>
> [openblas]
> libraries = openblasp
>
> LD_PRELOAD the file should also work.
>

Thanks!

I did some timings on a dot product of a square matrix of size 10000, LD_PRELOADing the different versions. I checked that all the cores were busy whenever a library other than plain libopenblas/libopenblas64 was selected. Here are the timings in seconds:

Intel i5-3317U:
/usr/lib64/libopenblaso.so      86.3651878834
/usr/lib64/libopenblasp64.so    96.8817200661
/usr/lib64/libopenblas.so      114.60265708
/usr/lib64/libopenblasp.so     107.927740097
/usr/lib64/libopenblaso64.so    97.5418870449
/usr/lib64/libopenblas64.so    109.000799179

Intel i7-4770:
/usr/lib64/libopenblas.so       37.9794859886
/usr/lib64/libopenblasp.so      12.3455951214
/usr/lib64/libopenblas64.so     38.0571939945
/usr/lib64/libopenblasp64.so    12.5558650494
/usr/lib64/libopenblaso64.so    12.4118559361
/usr/lib64/libopenblaso.so      13.4787950516

Both computers have the same software and OS.

So, it seems that openblas doesn't get a significant advantage from going parallel on the older i5; the i7, using all its cores (4 + 4 hyperthreads), gains a 3x speedup, and there is no big difference between OpenMP and pthreads.

I am particularly puzzled by the i5 results: shouldn't threads give a noticeable speedup?

/David.
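
P.S. In case anyone wants to repeat this kind of measurement, here is a minimal sketch of a timing script. It is not the exact script behind the numbers above; the function name time_dot, the float64 dtype, the random inputs and the single un-warmed run are all assumptions made for illustration.

```python
import time

import numpy as np


def time_dot(n, seed=0):
    """Time one n x n matrix product; return (seconds, result)."""
    # Random float64 inputs (an assumption; any dense matrices would do).
    rng = np.random.default_rng(seed)
    a = rng.random((n, n))
    b = rng.random((n, n))

    # Wall-clock time, so that work done on all threads is reflected.
    start = time.perf_counter()
    c = np.dot(a, b)
    elapsed = time.perf_counter() - start
    return elapsed, c


if __name__ == "__main__":
    n = 10000  # matrix size quoted in the message above
    elapsed, _ = time_dot(n)
    print(f"dot of two {n}x{n} matrices: {elapsed:.2f} s")
```

Saved under a hypothetical name such as bench_dot.py, it could be run once per library, e.g. LD_PRELOAD=/usr/lib64/libopenblasp.so python bench_dot.py, assuming NumPy is linked dynamically against a BLAS that the preloaded library can override.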