There isn't much code in NumPy that profits from modern x86
instruction sets. Even the simple arithmetic loops are strided and thus
cannot be vectorized by the compiler. They were vectorized manually in
1.8 using SSE2, and it is on my todo list to add runtime-detected AVX
support.
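
To illustrate the point about strided loops, here is a minimal sketch in C.
It is not NumPy's actual code: the function names (add_strided,
add_contiguous, add_dispatch) and the signature are made up for this mail.
The strided loop advances raw pointers by byte strides that are only known
at run time, which is what usually stops the compiler's auto-vectorizer. A
contiguous fast path gives the compiler (or hand-written SIMD) a loop it
can actually work with.

```c
#include <stddef.h>

/* Hypothetical strided inner loop: each operand is a raw byte pointer
 * that advances by its own stride (in bytes), known only at run time. */
static void add_strided(char *a, char *b, char *out, size_t n,
                        ptrdiff_t sa, ptrdiff_t sb, ptrdiff_t so)
{
    for (size_t i = 0; i < n; i++) {
        *(double *)out = *(const double *)a + *(const double *)b;
        a += sa;
        b += sb;
        out += so;
    }
}

/* Contiguous variant: unit stride and plain indexing, the form an
 * auto-vectorizer (or a manual SIMD version) can handle. */
static void add_contiguous(const double *a, const double *b,
                           double *out, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        out[i] = a[i] + b[i];
    }
}

/* Hypothetical dispatcher: take the contiguous path when every stride
 * equals the element size, otherwise fall back to the strided loop. */
static void add_dispatch(char *a, char *b, char *out, size_t n,
                         ptrdiff_t sa, ptrdiff_t sb, ptrdiff_t so)
{
    const ptrdiff_t elem = (ptrdiff_t)sizeof(double);

    if (sa == elem && sb == elem && so == elem) {
        add_contiguous((const double *)a, (const double *)b,
                       (double *)out, n);
    } else {
        add_strided(a, b, out, n, sa, sb, so);
    }
}
```

Calling add_dispatch with all strides equal to sizeof(double) takes the
contiguous path; any other stride (for example every second element) falls
back to the generic loop. Both paths compute the same result, only the
shape of the loop the compiler sees differs.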


On 26.11.2013 09:57, Daπid wrote:
> Have you tried on an Intel CPU? I have both an i5 quad core and an i7
> octo core where I could run it over the weekend. One may expect some
> compiler magic taking advantage of the advanced features, especially on the i7.

> 
>     Using the vbench I created a comparison of gcc and clang with different
>     options.
>     Cliff notes:
>     * gcc -O2 performs 5-10% better than -O3 in most benchmarks, except in a
>     few select cases where the vectorizer does its magic
>     * gcc and clang are very close in performance, but in the cases where one
>     compiler wins by a large margin, it's mostly gcc that wins
> 
>     I have collected some interesting plots in this notebook:
>     http://nbviewer.ipython.org/7646615


_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
