Re: [Numpy-discussion] einsum slow vs (tensor)dot
On 25 October 2012 22:54, David Warde-Farley wrote:

> On Wed, Oct 24, 2012 at 7:18 AM, George Nurser wrote:
>> Hi,
>>
>> I was just looking at the einsum function.
>> To me, it's a really elegant and clear way of doing array operations,
>> which is the core of what numpy is about.
>> It removes the need to remember a range of functions, some of which I
>> find tricky (e.g. tile).
>>
>> Unfortunately the present implementation seems ~4-6x slower than dot
>> or tensordot for decent-size arrays.
>> I suspect it is because the implementation does not use blas/lapack
>> calls.
>>
>> cheers, George Nurser.
>
> Hi George,
>
> IIRC (and I haven't dug into it heavily; I'm not a physicist, so I don't
> encounter this notation often), einsum implements a superset of what
> dot or tensordot (and the corresponding BLAS calls) can do. So I think
> logic is needed to carve out the special cases in which an einsum can
> be performed quickly with BLAS.

Hi David,

Yes, that's my reading of the situation as well.

> Pull requests in this vein would certainly be welcome, but they require
> the attention of someone who really understands how einsum works/can
> work.

...and I guess how to interface with BLAS/LAPACK.

cheers, George.

> David

_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
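To make the "superset" point above concrete, here is a short illustration (an editorial sketch, not from the thread) of einsum specifications that have no single dot/tensordot equivalent, alongside one that is exactly a matrix product:

```python
import numpy as np

x = np.arange(12.).reshape(3, 4)
y = np.ones((3, 4))

# Per-row inner products: no single dot/tensordot call does this.
rowdots = np.einsum('ij,ij->i', x, y)   # array([ 6., 22., 38.])

# Trace: contraction over a repeated index of one operand.
trace = np.einsum('ii', np.eye(3))      # 3.0

# By contrast, this spec is exactly np.dot(x, y.T), a BLAS gemm.
matmul = np.einsum('ij,jk->ik', x, y.T)
```

Only specs of the last kind can be handed off to BLAS; the first two need einsum's general contraction loop.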
Re: [Numpy-discussion] einsum slow vs (tensor)dot
On Wed, Oct 24, 2012 at 7:18 AM, George Nurser wrote:
> Hi,
>
> I was just looking at the einsum function.
> To me, it's a really elegant and clear way of doing array operations,
> which is the core of what numpy is about.
> It removes the need to remember a range of functions, some of which I
> find tricky (e.g. tile).
>
> Unfortunately the present implementation seems ~4-6x slower than dot or
> tensordot for decent-size arrays.
> I suspect it is because the implementation does not use blas/lapack calls.
>
> cheers, George Nurser.

Hi George,

IIRC (and I haven't dug into it heavily; I'm not a physicist, so I don't encounter this notation often), einsum implements a superset of what dot or tensordot (and the corresponding BLAS calls) can do. So I think logic is needed to carve out the special cases in which an einsum can be performed quickly with BLAS.

Pull requests in this vein would certainly be welcome, but they require the attention of someone who really understands how einsum works/can work.

David
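The special-casing described above could look something like the following sketch (einsum_maybe_blas is a hypothetical wrapper, not a numpy function; a real patch would parse the subscripts rather than string-match a few specs):

```python
import numpy as np

def einsum_maybe_blas(spec, *operands):
    """Hypothetical dispatcher: route einsum specs that are plain
    matrix products to the BLAS-backed np.dot, and fall back to the
    general einsum loop for everything else."""
    normalized = spec.replace(' ', '')
    # 'ij,jk' (implicit output) and 'ij,jk->ik' are exactly a matmul.
    if normalized in ('ij,jk', 'ij,jk->ik') and len(operands) == 2:
        return np.dot(operands[0], operands[1])
    return np.einsum(spec, *operands)
```

On specs it recognizes, the wrapper inherits dot's BLAS speed; every other spec keeps einsum's semantics unchanged.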
[Numpy-discussion] einsum slow vs (tensor)dot
Hi,

I was just looking at the einsum function. To me, it's a really elegant and clear way of doing array operations, which is the core of what numpy is about. It removes the need to remember a range of functions, some of which I find tricky (e.g. tile).

Unfortunately the present implementation seems ~4-6x slower than dot or tensordot for decent-size arrays. I suspect it is because the implementation does not use blas/lapack calls.

cheers, George Nurser.

E.g. (in ipython on Mac OS X 10.6, python 2.7.3, numpy 1.6.2 from macports):

a = np.arange(600000.).reshape(1500,400)
b = np.arange(240000.).reshape(400,600)
c = np.arange(600)
d = np.arange(400)

%timeit np.einsum('ij,jk', a, b)
10 loops, best of 3: 156 ms per loop
%timeit np.dot(a,b)
10 loops, best of 3: 27.4 ms per loop

%timeit np.einsum('i,ij,j',d,b,c)
1000 loops, best of 3: 709 us per loop
%timeit np.dot(d,np.dot(b,c))
10000 loops, best of 3: 121 us per loop

or

abig = np.arange(4800.).reshape(6,8,100)
bbig = np.arange(1920.).reshape(8,6,40)

%timeit np.einsum('ijk,jil->kl', abig, bbig)
1000 loops, best of 3: 425 us per loop
%timeit np.tensordot(abig,bbig, axes=([1,0],[0,1]))
10000 loops, best of 3: 105 us per loop
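As a sanity check on the benchmark above (an editorial addition; the arange sizes are chosen to match the reshape targets), each einsum call computes the same values as its dot/tensordot counterpart, so the timings compare like for like:

```python
import numpy as np

# Arrays matching the benchmark's shapes.
a = np.arange(600000.).reshape(1500, 400)
b = np.arange(240000.).reshape(400, 600)
c = np.arange(600.)
d = np.arange(400.)

# Matrix product and the three-operand contraction agree with dot.
assert np.allclose(np.einsum('ij,jk', a, b), np.dot(a, b))
assert np.allclose(np.einsum('i,ij,j', d, b, c), np.dot(d, np.dot(b, c)))

# The 3-D contraction agrees with tensordot over the same axes.
abig = np.arange(4800.).reshape(6, 8, 100)
bbig = np.arange(1920.).reshape(8, 6, 40)
assert np.allclose(np.einsum('ijk,jil->kl', abig, bbig),
                   np.tensordot(abig, bbig, axes=([1, 0], [0, 1])))
```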