> On 31 Jan 2016, at 9:48 am, Sebastian Berg <sebast...@sipsolutions.net> wrote:
>
> On Sat, 2016-01-30 at 20:27 +0100, Derek Homeier wrote:
>> On 27 Jan 2016, at 1:10 pm, Sebastian Berg
>> <sebast...@sipsolutions.net> wrote:
>>>
>>> On Wed, 2016-01-27 at 11:19 +0000, Nadav Horesh wrote:
>>>> Why is the dot function/method slower than @ on Python 3.5.1?
>>>> Tested from the latest 1.11 maintenance branch.
>>>>
>>>
>>> The explanation, I think, is that you do not have a BLAS
>>> optimization, in which case the fallback mode is probably faster
>>> in the @ case (since it has SSE2 optimization by using einsum,
>>> while np.dot does not do that).
>>
>> I am a bit confused now, as A @ c is just short for A.__matmul__(c)
>> or equivalent to np.matmul(A, c), so why would these not use the
>> optimised BLAS?
>> Also, I am getting almost identical results on my Mac, yet I thought
>> numpy would by default build against the VecLib optimised BLAS. If I
>> build explicitly against ATLAS, I am actually seeing slightly slower
>> results.
>> But I also saw this kind of warning on the first timeit runs:
>>
>> %timeit A.dot(c)
>> The slowest run took 6.91 times longer than the fastest. This could
>> mean that an intermediate result is being cached
>>
>> and when testing much larger arrays, the discrepancy between matmul
>> and dot rather increases, so perhaps this is more an issue of a less
>> memory-efficient implementation in np.dot?
>
> Sorry, I missed the fact that one of the arrays was 3D. In that case I
> am not even sure which of the functions call into BLAS or what else
> they have to do; I would have to check. Note that `np.dot` uses a
> different way of combining high-dimensional arrays. @/matmul
> broadcasts extra axes, while np.dot will do the outer combination of
> them, so that the result is:
>
>    As = list(A.shape)   # A.shape is a tuple, so copy it into a list
>    As.pop(-1)
>    cs = list(c.shape)
>    cs.pop(-2)  # if possible
>    result_shape = tuple(As + cs)
>
> which happens to be identical to the @/matmul result shape when only
> A has extra dimensions (A.ndim > 2 and c.ndim <= 2).
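
For reference, the shape rule quoted above can be checked with a short
sketch. The helper name dot_result_shape and the array sizes are made up
for illustration; only np.dot and the @ operator (np.matmul) are NumPy's
own, and @ needs Python 3.5 or later.

```python
import numpy as np

def dot_result_shape(a_shape, b_shape):
    """Result shape of np.dot for N-D inputs (hypothetical helper)."""
    a_rest = list(a_shape)
    a_rest.pop(-1)                 # np.dot contracts the last axis of a ...
    b_rest = list(b_shape)
    # ... with the second-to-last axis of b (or its only axis if b is 1-D)
    b_rest.pop(-2 if len(b_rest) >= 2 else -1)
    return tuple(a_rest + b_rest)

A = np.ones((4, 5, 3))
c = np.ones((3, 2))     # c.ndim <= 2: dot and matmul agree on the shape
B = np.ones((4, 3, 2))  # B.ndim > 2: the two rules differ

print(np.dot(A, c).shape, (A @ c).shape)  # (4, 5, 2) (4, 5, 2)
print(np.dot(A, B).shape)                 # (4, 5, 4, 2): outer combination
print((A @ B).shape)                      # (4, 5, 2): leading axis broadcast
```

With B.ndim > 2, np.dot pairs every leading index of A with every
leading index of B, whereas @ matches the leading axes up one to one.
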
Makes sense now: with A.ndim == 2, both operations take about the same
time (and are ~50% faster with VecLib than with ATLAS) and yield
identical results, while any additional dimension in A adds overhead to
np.dot, and the results then agree to within np.allclose tolerance but
are not exactly identical.

Thanks,
Derek

_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
https://mail.scipy.org/mailman/listinfo/numpy-discussion