I tried using this inner1d as an alternative to dot because it uses broadcasting. However, I found something surprising: not only is inner1d much slower than dot, it is also slower than einsum, which is much more general:

In [68]: import numpy as np

In [69]: import numpy.core.gufuncs_linalg as gula

In [70]: K = np.random.randn(1000,1000)

In [71]: %timeit gula.inner1d(K[:,np.newaxis,:], np.swapaxes(K,-1,-2)[np.newaxis,:,:])
1 loops, best of 3: 6.05 s per loop

In [72]: %timeit np.dot(K,K)
1 loops, best of 3: 392 ms per loop

In [73]: %timeit np.einsum('ik,kj->ij', K, K)
1 loops, best of 3: 1.24 s per loop

Why is it so? I thought that the performance of inner1d would be somewhere in between dot and einsum, probably closer to dot. Now I don't see any reason to use inner1d instead of einsum..

-Jaakko

On 03/15/2013 04:22 PM, Oscar Villellas wrote:
> In fact, there is already an inner1d implemented in
> numpy.core.umath_tests.inner1d
>
> from numpy.core.umath_tests import inner1d
>
> It should do the trick :)
>
> On Thu, Mar 14, 2013 at 12:54 PM, Jaakko Luttinen
> <jaakko.lutti...@aalto.fi> wrote:
>> Answering to myself, this pull request seems to implement an inner
>> product with broadcasting (inner1d) and many other useful functions:
>> https://github.com/numpy/numpy/pull/2954/
>> -J
>>
>> On 03/13/2013 04:21 PM, Jaakko Luttinen wrote:
>>> Hi!
>>>
>>> How can I compute dot product (or similar multiply&sum operations)
>>> efficiently so that broadcasting is utilized?
>>> For multi-dimensional arrays, NumPy's inner and dot functions do not
>>> match the leading axes and use broadcasting, but instead the result has
>>> first the leading axes of the first input array and then the leading
>>> axes of the second input array.
>>>
>>> For instance, I would like to compute the following inner-product:
>>> np.sum(A*B, axis=-1)
>>>
>>> But numpy.inner gives:
>>> A = np.random.randn(2,3,4)
>>> B = np.random.randn(3,4)
>>> np.inner(A,B).shape
>>> # -> (2, 3, 3) instead of (2, 3)
>>>
>>> Similarly for dot product, I would like to compute for instance:
>>> np.sum(A[...,:,:,np.newaxis]*B[...,np.newaxis,:,:], axis=-2)
>>>
>>> But numpy.dot gives:
>>> In [12]: A = np.random.randn(2,3,4); B = np.random.randn(2,4,5)
>>> In [13]: np.dot(A,B).shape
>>> # -> (2, 3, 2, 5) instead of (2, 3, 5)
>>>
>>> I could use einsum for these operations, but I'm not sure whether that's
>>> as efficient as using some BLAS-supported(?) dot products.
>>>
>>> I couldn't find any function which could perform this kind of
>>> operations. NumPy's functions seem to either flatten the input arrays
>>> (vdot, outer) or just use the axes of the input arrays separately (dot,
>>> inner, tensordot).
>>>
>>> Any help?
>>>
>>> Best regards,
>>> Jaakko
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
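
For reference, here is a minimal sketch of the two broadcasting products asked about in the original message, written with einsum and checked against the explicit multiply-and-sum expressions from the thread. The array names C and D and the seeded random generator are illustrative choices, not part of the original posts. The last line uses np.matmul, which is an assumption about the NumPy version: it only exists in NumPy 1.10 and later, so it was not available when this thread was written. The sketch makes no claim about relative speed.

```python
import numpy as np

rng = np.random.default_rng(0)

# Inner product over the last axis, with the leading axes broadcast.
A = rng.standard_normal((2, 3, 4))
B = rng.standard_normal((3, 4))
inner_ref = np.sum(A * B, axis=-1)            # explicit multiply and sum
inner_es = np.einsum('...i,...i->...', A, B)  # same contraction via einsum

# Matrix product over the last two axes, with the leading axes broadcast.
C = rng.standard_normal((2, 3, 4))
D = rng.standard_normal((2, 4, 5))
dot_ref = np.sum(C[..., :, :, np.newaxis] * D[..., np.newaxis, :, :], axis=-2)
dot_es = np.einsum('...ij,...jk->...ik', C, D)
dot_mm = np.matmul(C, D)  # assumes NumPy >= 1.10

print(inner_es.shape, dot_es.shape, dot_mm.shape)
```

Both einsum results have the broadcast shapes requested in the original question, unlike np.inner and np.dot, which concatenate the leading axes of the two inputs.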