Re: [Numpy-discussion] einsum and broadcasting
On Thu, 2013-04-04 at 16:56 +0300, Jaakko Luttinen wrote:

> I don't quite understand how einsum handles broadcasting. I get the following error, but I don't understand why:
>
> In [8]: import numpy as np
> In [9]: A = np.arange(12).reshape((4,3))
> In [10]: B = np.arange(6).reshape((3,2))
> In [11]: np.einsum('ik,k...->i...', A, B)
> ---
> ValueError: operand 0 did not have enough dimensions to match the broadcasting, and couldn't be extended because einstein sum subscripts were specified at both the start and end
>
> However, if I use explicit indexing, it works:
>
> In [12]: np.einsum('ik,kj->ij', A, B)
> Out[12]: array([[10, 13], [28, 40], [46, 67], [64, 94]])
>
> It seems that it also works if I add '...' to the first operand:
>
> In [12]: np.einsum('ik...,k...->i...', A, B)
> Out[12]: array([[10, 13], [28, 40], [46, 67], [64, 94]])
>
> However, as far as I understand, the syntax np.einsum('ik,k...->i...', A, B) should work. Have I misunderstood something or is there a bug?

My guess is that it is by design, because the purpose of the ellipsis is more to allow extra dimensions that are not important to the problem itself. A vector product is np.einsum('i,i->i'), and if I write np.einsum('...i,...i->...i') I allow generalizing that to arrays of 1-d arrays (like the new gufunc linalg stuff). I did not check the code though, so maybe that's not the case. But in any case I don't see a reason why it should not be possible to allow extra dims for only some of the inputs (I guess it can also make sense to not give ... for the output). So I would say, if you want to generalize it, go ahead ;).

- Sebastian

> Thanks for your help!
> Jaakko

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
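Sebastian's reading can be checked directly: adding the ellipsis to every operand gives the same result as the fully explicit subscripts. A minimal sketch (the failing `'ik,k...->i...'` form from the original post is deliberately omitted):

```python
import numpy as np

A = np.arange(12).reshape(4, 3)
B = np.arange(6).reshape(3, 2)

# Fully explicit subscripts: an ordinary matrix product.
explicit = np.einsum('ik,kj->ij', A, B)

# Ellipsis on *both* operands: '...' matches zero extra dimensions of A
# and the trailing dimension of B, and the broadcast gives the same result.
with_ellipsis = np.einsum('ik...,k...->i...', A, B)

assert np.array_equal(explicit, np.dot(A, B))
assert np.array_equal(with_ellipsis, np.dot(A, B))
```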
[Numpy-discussion] einsum and broadcasting
I don't quite understand how einsum handles broadcasting. I get the following error, but I don't understand why:

In [8]: import numpy as np
In [9]: A = np.arange(12).reshape((4,3))
In [10]: B = np.arange(6).reshape((3,2))
In [11]: np.einsum('ik,k...->i...', A, B)
---
ValueError: operand 0 did not have enough dimensions to match the broadcasting, and couldn't be extended because einstein sum subscripts were specified at both the start and end

However, if I use explicit indexing, it works:

In [12]: np.einsum('ik,kj->ij', A, B)
Out[12]: array([[10, 13], [28, 40], [46, 67], [64, 94]])

It seems that it also works if I add '...' to the first operand:

In [12]: np.einsum('ik...,k...->i...', A, B)
Out[12]: array([[10, 13], [28, 40], [46, 67], [64, 94]])

However, as far as I understand, the syntax np.einsum('ik,k...->i...', A, B) should work. Have I misunderstood something or is there a bug?

Thanks for your help!
Jaakko
Re: [Numpy-discussion] einsum slow vs (tensor)dot
On 25 October 2012 22:54, David Warde-Farley warde...@iro.umontreal.ca wrote:

> On Wed, Oct 24, 2012 at 7:18 AM, George Nurser gnur...@gmail.com wrote: Hi, I was just looking at the einsum function. To me, it's a really elegant and clear way of doing array operations, which is the core of what numpy is about. It removes the need to remember a range of functions, some of which I find tricky (e.g. tile). Unfortunately the present implementation seems ~4-6x slower than dot or tensordot for decent-size arrays. I suspect it is because the implementation does not use BLAS/LAPACK calls. cheers, George Nurser.
>
> Hi George, IIRC (and I haven't dug into it heavily; not a physicist so I don't encounter this notation often), einsum implements a superset of what dot or tensordot (and the corresponding BLAS calls) can do. So, I think that logic is needed to carve out the special cases in which an einsum can be performed quickly with BLAS.

Hi David,

Yes, that's my reading of the situation as well.

> Pull requests in this vein would certainly be welcome, but it requires the attention of someone who really understands how einsum works/can work.
> David

...and I guess how to interface with BLAS/LAPACK.

cheers, George.
Re: [Numpy-discussion] einsum slow vs (tensor)dot
On Wed, Oct 24, 2012 at 7:18 AM, George Nurser gnur...@gmail.com wrote:

> Hi, I was just looking at the einsum function. To me, it's a really elegant and clear way of doing array operations, which is the core of what numpy is about. It removes the need to remember a range of functions, some of which I find tricky (e.g. tile). Unfortunately the present implementation seems ~4-6x slower than dot or tensordot for decent-size arrays. I suspect it is because the implementation does not use BLAS/LAPACK calls. cheers, George Nurser.

Hi George,

IIRC (and I haven't dug into it heavily; I'm not a physicist, so I don't encounter this notation often), einsum implements a superset of what dot or tensordot (and the corresponding BLAS calls) can do. So, I think that logic is needed to carve out the special cases in which an einsum can be performed quickly with BLAS. Pull requests in this vein would certainly be welcome, but it requires the attention of someone who really understands how einsum works/can work.

David
[Numpy-discussion] einsum slow vs (tensor)dot
Hi, I was just looking at the einsum function. To me, it's a really elegant and clear way of doing array operations, which is the core of what numpy is about. It removes the need to remember a range of functions, some of which I find tricky (e.g. tile). Unfortunately the present implementation seems ~4-6x slower than dot or tensordot for decent-size arrays. I suspect it is because the implementation does not use BLAS/LAPACK calls.

cheers, George Nurser.

E.g. (in ipython on Mac OS X 10.6, python 2.7.3, numpy 1.6.2 from macports):

a = np.arange(600000.).reshape(1500,400)
b = np.arange(240000.).reshape(400,600)
c = np.arange(600)
d = np.arange(400)

%timeit np.einsum('ij,jk', a, b)
10 loops, best of 3: 156 ms per loop
%timeit np.dot(a,b)
10 loops, best of 3: 27.4 ms per loop
%timeit np.einsum('i,ij,j',d,b,c)
1000 loops, best of 3: 709 us per loop
%timeit np.dot(d,np.dot(b,c))
1 loops, best of 3: 121 us per loop

or

abig = np.arange(4800.).reshape(6,8,100)
bbig = np.arange(1920.).reshape(8,6,40)

%timeit np.einsum('ijk,jil->kl', abig, bbig)
1000 loops, best of 3: 425 us per loop
%timeit np.tensordot(abig,bbig, axes=([1,0],[0,1]))
1 loops, best of 3: 105 us per loop
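For what it's worth, the speed gap is not a correctness gap. A quick sketch (with the benchmark's array sizes scaled down so it runs instantly; the shapes here are illustrative only) confirms that einsum and the BLAS-backed dot/tensordot agree:

```python
import numpy as np

# Shrunk stand-ins for the benchmark arrays above.
a = np.arange(60.0).reshape(15, 4)
b = np.arange(24.0).reshape(4, 6)
c = np.arange(6.0)
d = np.arange(4.0)

# Matrix product and the chained vector-matrix-vector product.
assert np.allclose(np.einsum('ij,jk', a, b), np.dot(a, b))
assert np.allclose(np.einsum('i,ij,j', d, b, c), np.dot(d, np.dot(b, c)))

# The 3-d contraction, same shapes as the benchmark.
abig = np.arange(4800.0).reshape(6, 8, 100)
bbig = np.arange(1920.0).reshape(8, 6, 40)
assert np.allclose(np.einsum('ijk,jil->kl', abig, bbig),
                   np.tensordot(abig, bbig, axes=([1, 0], [0, 1])))
```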
[Numpy-discussion] einsum evaluation order
Dear all,

I was just looking into numpy.einsum and encountered an issue which might be worth pointing out in the documentation. Let us say you wish to evaluate something like this (repeated indices are summed):

D[alpha, alphaprime] = A[alpha, beta, sigma] * B[alphaprime, betaprime, sigma] * C[beta, betaprime]

with einsum as einsum('abs,cds,bd->ac', A, B, C). Then it is not exactly clear in which order einsum evaluates the contractions (or whether it does it in one go). This can be very important, since you can do it in several ways, one of which has the least computational complexity. The most efficient way of doing it is to contract e.g. A and C and then contract that with B (or exchange A and B). A quick test on my laptop says 2.6 s with einsum and 0.13 s for two tensordots, with A and B being D x D x 2 and C being D x D, for D = 96. This seems to explode for higher dimensions, whereas it is much better with the two independent contractions (I believe it should be O(D^3)). For D = 512 I could do it in 5 s with two contractions, whereas I stopped waiting after 60 s for einsum (I guess einsum is probably O(D^4) in this case).

I had in fact thought of making a function similar to einsum for a while, but dropped it after I noticed einsum. I think, however, that there might still be room for a tool for evaluating more complicated expressions efficiently. I think the best way would be for the user to enter an expression like the one above, which is then evaluated in the optimal order. I know how to do this (theoretically) if all the repeated indices occur only twice (like the expression above), but for the more general expressions supported by einsum I am not sure how to do it (haven't thought about it). Here I am thinking about stuff like x[i] = a[i] * b[i] and their more general counterparts (at first glance this seems to be a simpler problem than full contractions). Do you think there is a need/interest for this kind of thing? In that case I would like to write it / help write it.

Much of it, I think, can be reduced to decomposing the expression into existing numpy operations (e.g. tensordot). How to incorporate issues of storage layout etc., however, I have no idea. In any case I think it might be nice to state explicitly in the docs how the expression in einsum is evaluated.

Søren Gammelmark
PhD student
Department of Physics and Astronomy
Aarhus University
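A minimal sketch of the two-contraction strategy described above, with small hypothetical dimensions (D = 8, a two-valued sigma index); the pairwise tensordot route computes the same D[alpha, alphaprime] as the one-shot einsum:

```python
import numpy as np

rng = np.random.RandomState(0)
D = 8
A = rng.rand(D, D, 2)   # A[alpha, beta, sigma]
B = rng.rand(D, D, 2)   # B[alphaprime, betaprime, sigma]
C = rng.rand(D, D)      # C[beta, betaprime]

# One-shot contraction, as einsum evaluates it naively:
one_shot = np.einsum('abs,cds,bd->ac', A, B, C)

# Pairwise contractions: first T[a, s, d] = sum_b A[a, b, s] * C[b, d],
T = np.tensordot(A, C, axes=([1], [0]))          # shape (D, 2, D)
# then result[a, c] = sum_{s, d} T[a, s, d] * B[c, d, s].
pairwise = np.tensordot(T, B, axes=([1, 2], [2, 1]))

assert np.allclose(one_shot, pairwise)
```

(For reference, later NumPy releases added an `optimize` keyword to einsum that chooses such pairwise contraction orders automatically.)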
Re: [Numpy-discussion] einsum evaluation order
On Tue, Jan 24, 2012 at 6:32 AM, Søren Gammelmark gammelm...@gmail.com wrote:

> [...] it is not exactly clear in which order einsum evaluates the contractions (or whether it does it in one go). This can be very important, since you can do it in several ways, one of which has the least computational complexity. [...] A quick test on my laptop says 2.6 s with einsum and 0.13 s for two tensordots, with A and B being D x D x 2 and C being D x D, for D = 96. [...] For D = 512 I could do it in 5 s with two contractions, whereas I stopped waiting after 60 s for einsum (I guess einsum is probably O(D^4) in this case).

You are correct, einsum presently uses the most naive evaluation.

> [...] Much of it, I think, can be reduced to decomposing the expression into existing numpy operations (e.g. tensordot). How to incorporate issues of storage layout etc., however, I have no idea.

I think a good approach would be to modify einsum so it decomposes the expression into multiple products. It may even just be a simple dynamic programming problem, but I haven't given it much thought.

> In any case I think it might be nice to write explicitly how the expression in einsum is evaluated in the docs.

That's a good idea, yes.

Thanks,
Mark
[Numpy-discussion] einsum performance with many operands
Hi, I was playing around with einsum (which is really cool, by the way!), and I noticed something strange (or unexpected at least) performance-wise. Here is an example: let x and y be two NxM arrays, and W an NxN array. I want to compute

f = x^{ik} W_i^j y_{jk}

(I hope the notation is clear enough.) What surprises me is that it's much faster to compute the two products separately, as in ((xW)y) or (x(Wy)), rather than directly passing the three-operand expression to einsum. I did a quick benchmark, with the following script:

#
import numpy as np

def f1(x, W, y):
    return np.einsum('ik,ij,jk', x, W, y)

def f2(x, W, y):
    return np.einsum('jk,jk', np.einsum('ik,ij->jk', x, W), y)

n = 30
m = 150
x = np.random.random(size=(n, m))
y = np.random.random(size=(n, m))
W = np.random.random(size=(n, n))

setup = "from __main__ import f1,f2,x,y,W"

if __name__ == '__main__':
    import timeit
    min_f1_time = min(timeit.repeat(stmt='f1(x,W,y)', setup=setup, repeat=10, number=1))
    min_f2_time = min(timeit.repeat(stmt='f2(x,W,y)', setup=setup, repeat=10, number=1))
    print('f1: {time}'.format(time=min_f1_time))
    print('f2: {time}'.format(time=min_f2_time))
#

and I get (on a particular trial on my Intel 64-bit machine running Linux and numpy 1.6.1) the following:

f1: 2.86719584465
f2: 0.891730070114

Of course, I know nothing of einsum's internals and there's probably a good reason for this. But still, it's quite counterintuitive to me; I just thought I'd mention it because a quick search didn't bring up anything on this topic, and AFAIK einsum's docs/examples don't cover the case with more than two operands. Anyway: I hope I've been clear enough, and that I didn't bring up some already-debated topic.

Thanks in advance for any clarification,
Eugenio
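The two code paths in the script above can be checked for agreement (same shapes as the benchmark; the surprise is the timing gap, not the values):

```python
import numpy as np

n, m = 30, 150
rng = np.random.RandomState(42)
x = rng.rand(n, m)
y = rng.rand(n, m)
W = rng.rand(n, n)

# Three operands in a single call (the slow path described above):
f1 = np.einsum('ik,ij,jk', x, W, y)

# The same scalar via two pairwise contractions:
f2 = np.einsum('jk,jk', np.einsum('ik,ij->jk', x, W), y)

assert np.allclose(f1, f2)
```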
[Numpy-discussion] einsum
I wrote a new function, einsum, which implements Einstein summation notation, and I'd like comments/thoughts from people who might be interested in this kind of thing. In testing it, it is also faster than many of NumPy's built-in functions, except for dot and inner. At the bottom of this email you can find the documentation blurb I wrote for it, and here are some timings:

In [1]: import numpy as np
In [2]: a = np.arange(25).reshape(5,5)
In [3]: timeit np.einsum('ii', a)
10 loops, best of 3: 3.45 us per loop
In [4]: timeit np.trace(a)
10 loops, best of 3: 9.8 us per loop
In [5]: timeit np.einsum('ii->i', a)
100 loops, best of 3: 1.19 us per loop
In [6]: timeit np.diag(a)
10 loops, best of 3: 7 us per loop
In [7]: b = np.arange(30).reshape(5,6)
In [8]: timeit np.einsum('ij,jk', a, b)
1 loops, best of 3: 11.4 us per loop
In [9]: timeit np.dot(a, b)
10 loops, best of 3: 2.8 us per loop
In [10]: a = np.arange(1.)
In [11]: timeit np.einsum('i->', a)
1 loops, best of 3: 22.1 us per loop
In [12]: timeit np.sum(a)
1 loops, best of 3: 25.5 us per loop

-Mark

The documentation:

einsum(subscripts, *operands, out=None, dtype=None, order='K', casting='safe')

Evaluates the Einstein summation convention on the operands.

Using the Einstein summation convention, many common multi-dimensional array operations can be represented in a simple fashion. This function provides a way to compute such summations. The best way to understand this function is to try the examples below, which show how many common NumPy functions can be implemented as calls to einsum.

The subscripts string is a comma-separated list of subscript labels, where each label refers to a dimension of the corresponding operand. Repeated subscript labels in one operand take the diagonal. For example, ``np.einsum('ii', a)`` is equivalent to ``np.trace(a)``. Whenever a label is repeated, it is summed, so ``np.einsum('i,i', a, b)`` is equivalent to ``np.inner(a,b)``.
If a label appears only once, it is not summed, so ``np.einsum('i', a)`` produces a view of ``a`` with no changes. The order of labels in the output is by default alphabetical. This means that ``np.einsum('ij', a)`` doesn't affect a 2D array, while ``np.einsum('ji', a)`` takes its transpose.

The output can be controlled by specifying output subscript labels as well. This specifies the label order, and allows summing to be disallowed or forced when desired. The call ``np.einsum('i->', a)`` is equivalent to ``np.sum(a, axis=-1)``, and ``np.einsum('ii->i', a)`` is equivalent to ``np.diag(a)``.

It is also possible to control how broadcasting occurs using an ellipsis. To take the trace along the first and last axes, you can do ``np.einsum('i...i', a)``, or to do a matrix-matrix product with the left-most indices instead of rightmost, you can do ``np.einsum('ij...,jk...->ik...', a, b)``.

When there is only one operand, no axes are summed, and no output parameter is provided, a view into the operand is returned instead of a new array. Thus, taking the diagonal as ``np.einsum('ii->i', a)`` produces a view.

Parameters
----------
subscripts : string
    Specifies the subscripts for summation.
operands : list of array_like
    These are the arrays for the operation.
out : None or array
    If provided, the calculation is done into this array.
dtype : None or data type
    If provided, forces the calculation to use the data type specified. Note that you may have to also give a more liberal ``casting`` parameter to allow the conversions.
order : 'C', 'F', 'A', or 'K'
    Controls the memory layout of the output. 'C' means it should be Fortran contiguous. 'F' means it should be Fortran contiguous, 'A' means it should be 'F' if the inputs are all 'F', 'C' otherwise. 'K' means it should be as close to the layout of the inputs as is possible, including arbitrarily permuted axes.
casting : 'no', 'equiv', 'safe', 'same_kind', 'unsafe'
    Controls what kind of data casting may occur.
    Setting this to 'unsafe' is not recommended, as it can adversely affect accumulations. 'no' means the data types should not be cast at all. 'equiv' means only byte-order changes are allowed. 'safe' means only casts which can preserve values are allowed. 'same_kind' means only safe casts or casts within a kind, like float64 to float32, are allowed. 'unsafe' means any data conversions may be done.

Returns
-------
output : ndarray
    The calculation based on the Einstein summation convention.

See Also
--------
dot, inner, outer, tensordot

Examples
--------
a = np.arange(25).reshape(5,5)
b = np.arange(5)
c = np.arange(6).reshape(2,3)
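Since the Examples section is cut off above, here is a short sketch (not the original docstring examples) exercising the documented equivalences:

```python
import numpy as np

a = np.arange(25).reshape(5, 5)
b = np.arange(5)

# A repeated label in one operand takes the diagonal; summing it gives the trace.
assert np.einsum('ii', a) == np.trace(a)
# Forcing the repeated label into the output keeps the diagonal unsummed.
assert np.array_equal(np.einsum('ii->i', a), np.diag(a))
# Output label order controls transposition.
assert np.array_equal(np.einsum('ji', a), a.T)
# A label repeated across operands is summed: the inner product.
assert np.einsum('i,i', b, b) == np.inner(b, b)
# An ellipsis lets the trace run over the first and last axes.
assert np.einsum('i...i', a) == np.trace(a)
```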
Re: [Numpy-discussion] einsum
On Wed, Jan 26, 2011 at 11:27 AM, Mark Wiebe mwwi...@gmail.com wrote:

> I wrote a new function, einsum, which implements Einstein summation notation, and I'd like comments/thoughts from people who might be interested in this kind of thing.

This sounds really cool! I've definitely considered doing something like this previously, but never really got around to seriously figuring out any sensible API. Do you have the source up somewhere? I'd love to try it out myself.

--Josh
Re: [Numpy-discussion] einsum
On Wed, Jan 26, 2011 at 1:36 PM, Joshua Holbrook josh.holbr...@gmail.com wrote:

> This sounds really cool! I've definitely considered doing something like this previously, but never really got around to seriously figuring out any sensible API. Do you have the source up somewhere? I'd love to try it out myself.

You can check out the new_iterator branch from here: https://github.com/m-paradox/numpy

$ git clone https://github.com/m-paradox/numpy.git
Cloning into numpy...

-Mark
Re: [Numpy-discussion] einsum
On Wed, Jan 26, 2011 at 12:48 PM, Mark Wiebe mwwi...@gmail.com wrote:

> You can check out the new_iterator branch from here: https://github.com/m-paradox/numpy
>
> $ git clone https://github.com/m-paradox/numpy.git
> Cloning into numpy...

Thanks for the link!

How closely coupled is this new code with numpy's internals? That is, could you factor it out into its own package? If so, then people could have immediate use out of it without having to integrate it into numpy proper.

--Josh
Re: [Numpy-discussion] einsum
On Wed, Jan 26, 2011 at 2:01 PM, Joshua Holbrook josh.holbr...@gmail.com wrote:

> [snip]
>
> How closely coupled is this new code with numpy's internals? That is, could you factor it out into its own package? If so, then people could have immediate use out of it without having to integrate it into numpy proper.

The code depends heavily on the iterator I wrote, and I think the idea itself depends on having a good dynamic multi-dimensional array library. When the numpy-refactor branch is complete, this would be part of libndarray, and could be used directly from C without depending on Python.

-Mark
Re: [Numpy-discussion] einsum
On Wed, Jan 26, 2011 at 16:43, Mark Wiebe mwwi...@gmail.com wrote:

> On Wed, Jan 26, 2011 at 2:01 PM, Joshua Holbrook josh.holbr...@gmail.com wrote: [snip] How closely coupled is this new code with numpy's internals? That is, could you factor it out into its own package? If so, then people could have immediate use out of it without having to integrate it into numpy proper.
>
> The code depends heavily on the iterator I wrote, and I think the idea itself depends on having a good dynamic multi-dimensional array library. When the numpy-refactor branch is complete, this would be part of libndarray, and could be used directly from C without depending on Python.

I think his real question is whether einsum() and the iterator stuff can live in a separate module that *uses* a released version of numpy rather than a development branch.

-- Robert Kern

I have come to believe that the whole world is an enigma, a harmless enigma that is made terrible by our own mad attempt to interpret it as though it had an underlying truth. -- Umberto Eco
Re: [Numpy-discussion] einsum
> I think his real question is whether einsum() and the iterator stuff can live in a separate module that *uses* a released version of numpy rather than a development branch. -- Robert Kern

Indeed, I would like to be able to install and use einsum() without having to install another version of numpy. Even if it depends on features of a new numpy, it'd be nice to have it be a separate module.

--Josh
Re: [Numpy-discussion] einsum
Mark,

Interesting idea. Given the fact that in a 2-d euclidean metric the Einstein summation conventions are only a way to write out conventional matrix multiplications, do you consider at some point including a non-euclidean metric in this thing? (As you have in special relativity, for example.) Something along the lines of:

eta = np.diag([-1,1,1,1])
a = np.array([1,2,3,4])
b = np.array([1,1,1,1])

such that einsum('i,i', a, b, metric=eta) = -1 + 2 + 3 + 4

I don't know how useful it would be, just a thought.

Hanno

On 26.01.2011 at 21:27, Mark Wiebe wrote:

> I wrote a new function, einsum, which implements Einstein summation notation, and I'd like comments/thoughts from people who might be interested in this kind of thing.
Re: [Numpy-discussion] einsum
On Thu, Jan 27, 2011 at 12:18:30AM +0100, Hanno Klemm wrote:

> interesting idea. Given the fact that in 2-d euclidean metric, the Einstein summation conventions are only a way to write out conventional matrix multiplications, do you consider at some point to include a non-euclidean metric in this thing? (As you have in special relativity, for example)

In my experience, Einstein summation conventions are quite incomprehensible for people who haven't studied relativity (they aren't used much outside some narrow fields of physics). If you start adding metrics, you'll make it even harder for people to follow.

My 2 cents,
Gaël
Re: [Numpy-discussion] einsum
On Wed, Jan 26, 2011 at 3:05 PM, Joshua Holbrook josh.holbr...@gmail.com wrote:

> I think his real question is whether einsum() and the iterator stuff can live in a separate module that *uses* a released version of numpy rather than a development branch. -- Robert Kern
>
> Indeed, I would like to be able to install and use einsum() without having to install another version of numpy. Even if it depends on features of a new numpy, it'd be nice to have it be a separate module. --Josh

Ah, sorry for misunderstanding. That would actually be very difficult, as the iterator required a fair bit of fixes and adjustments to the core. The new_iterator branch should be 1.5 ABI compatible, if that helps.

-Mark
Re: [Numpy-discussion] einsum
On Wed, Jan 26, 2011 at 3:18 PM, Hanno Klemm kl...@phys.ethz.ch wrote:

> Mark, interesting idea. Given the fact that in 2-d euclidean metric, the Einstein summation conventions are only a way to write out conventional matrix multiplications, do you consider at some point to include a non-euclidean metric in this thing? (As you have in special relativity, for example) Something along the lines:
>
> eta = np.diag([-1,1,1,1])
> a = np.array([1,2,3,4])
> b = np.array([1,1,1,1])
>
> such that einsum('i,i', a, b, metric=eta) = -1 + 2 + 3 + 4

This particular example is already doable as follows:

>>> eta = np.diag([-1,1,1,1])
>>> eta
array([[-1,  0,  0,  0],
       [ 0,  1,  0,  0],
       [ 0,  0,  1,  0],
       [ 0,  0,  0,  1]])
>>> a = np.array([1,2,3,4])
>>> b = np.array([1,1,1,1])
>>> np.einsum('i,j,ij', a, b, eta)
8

I think that's right, did I understand you correctly?

Cheers,
Mark
Re: [Numpy-discussion] einsum
On 27.01.2011 at 00:29, Mark Wiebe wrote:

> This particular example is already doable as follows:
>
> >>> eta = np.diag([-1,1,1,1])
> >>> a = np.array([1,2,3,4])
> >>> b = np.array([1,1,1,1])
> >>> np.einsum('i,j,ij', a, b, eta)
> 8
>
> I think that's right, did I understand you correctly?

Yes, that's what I had in mind. Thanks.
Re: [Numpy-discussion] einsum
On Wednesday, January 26, 2011, Gael Varoquaux gael.varoqu...@normalesup.org wrote:

> In my experience, Einstein summation conventions are quite incomprehensible for people who haven't studied relativity (they aren't used much outside some narrow fields of physics). If you start adding metrics, you'll make it even harder for people to follow. My 2 cents, Gaël

Just to dispel the notion that Einstein notation is only used in the study of relativity, I can personally attest that Einstein notation is used in the field of fluid dynamics and some aspects of meteorology.

This is really a neat idea, and I support the idea of packaging it as a separate module.

Ben Root
Re: [Numpy-discussion] einsum
> Ah, sorry for misunderstanding. That would actually be very difficult, as the iterator required a fair bit of fixes and adjustments to the core. The new_iterator branch should be 1.5 ABI compatible, if that helps.

I see. Perhaps the fixes and adjustments can/should be included in standard numpy, even if the Einstein notation package is made a separate module.

> Just to dispel the notion that Einstein notation is only used in the study of relativity, I can personally attest that Einstein notation is used in the field of fluid dynamics and some aspects of meteorology.

Einstein notation is also used in solid mechanics.

--Josh
Re: [Numpy-discussion] einsum
On Wed, Jan 26, 2011 at 5:02 PM, Joshua Holbrook josh.holbr...@gmail.com wrote:

> I see. Perhaps the fixes and adjustments can/should be included in standard numpy, even if the Einstein notation package is made a separate module.
>
> [snip]
>
> Indeed, I would like to be able to install and use einsum() without having to install another version of numpy. Even if it depends on features of a new numpy, it'd be nice to have it be a separate module.

I don't really understand the desire to have this single function exist in a separate package. If it requires the new version of NumPy, then you'll have to install/upgrade either way... and if it comes as part of that new NumPy, then you are already set. Doesn't a separate package complicate things unnecessarily? It makes sense to me if einsum consisted of many functions (such as Bottleneck).
Re: [Numpy-discussion] einsum
On Wed, Jan 26, 2011 at 7:35 PM, Benjamin Root ben.r...@ou.edu wrote:

> Just to dispel the notion that Einstein notation is only used in the study of relativity, I can personally attest that Einstein notation is used in the field of fluid dynamics and some aspects of meteorology. This is really a neat idea and I support the idea of packaging it as a separate module.

So, if I read the examples correctly, we finally get dot along an axis:

np.einsum('ijk,ji->', a, b)
np.einsum('ijk,jik->k', a, b)

or something like this. The notation might require getting used to, but it doesn't look worse than figuring out what tensordot does. The only disadvantage I see is that choosing the axes to operate on in a program or function requires string manipulation.

Josef
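A hedged reading of the "dot along an axis" examples above (the shapes here are hypothetical, since the post doesn't give them): contracting i and j while keeping k applies a dot product along the last axis, equivalent to a tensordot over the first two axes:

```python
import numpy as np

a = np.arange(24.0).reshape(2, 3, 4)
b = np.arange(6.0).reshape(3, 2)

# Contract i and j, keep k: r[k] = sum_{i,j} a[i,j,k] * b[j,i]
r = np.einsum('ijk,ji->k', a, b)

# The same contraction spelled with tensordot, for comparison:
expected = np.tensordot(b, a, axes=([0, 1], [1, 0]))
assert np.allclose(r, expected)
```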
Re: [Numpy-discussion] einsum
Nice function, and wonderful that it speeds some tasks up. Some feedback: the following notation is a little counterintuitive to me:

np.einsum('i...->', a)
array([50, 55, 60, 65, 70])
np.sum(a, axis=0)
array([50, 55, 60, 65, 70])

Since there is nothing after the '->', I expected a scalar, not a vector. I might suggest 'i...->...'. Just noticed also a typo in the doc:

order : 'C', 'F', 'A', or 'K' Controls the memory layout of the output. 'C' means it should be Fortran contiguous. 'F' means it should be Fortran contiguous,

should be changed to

order : 'C', 'F', 'A', or 'K' Controls the memory layout of the output. 'C' means it should be C contiguous. 'F' means it should be Fortran contiguous,

Hope this helps, Jonathan

On Wed, Jan 26, 2011 at 2:27 PM, Mark Wiebe mwwi...@gmail.com wrote: I wrote a new function, einsum, which implements Einstein summation notation, and I'd like comments/thoughts from people who might be interested in this kind of thing. In testing, it is also faster than many of NumPy's built-in functions, except for dot and inner. At the bottom of this email you can find the documentation blurb I wrote for it, and here are some timings:

In [1]: import numpy as np
In [2]: a = np.arange(25).reshape(5,5)
In [3]: timeit np.einsum('ii', a)
10 loops, best of 3: 3.45 us per loop
In [4]: timeit np.trace(a)
10 loops, best of 3: 9.8 us per loop
In [5]: timeit np.einsum('ii->i', a)
100 loops, best of 3: 1.19 us per loop
In [6]: timeit np.diag(a)
10 loops, best of 3: 7 us per loop
In [7]: b = np.arange(30).reshape(5,6)
In [8]: timeit np.einsum('ij,jk', a, b)
1 loops, best of 3: 11.4 us per loop
In [9]: timeit np.dot(a, b)
10 loops, best of 3: 2.8 us per loop
In [10]: a = np.arange(1.)
In [11]: timeit np.einsum('i->', a)
1 loops, best of 3: 22.1 us per loop
In [12]: timeit np.sum(a)
1 loops, best of 3: 25.5 us per loop

-Mark

The documentation:

einsum(subscripts, *operands, out=None, dtype=None, order='K', casting='safe')

Evaluates the Einstein summation convention on the operands. Using the Einstein summation convention, many common multi-dimensional array operations can be represented in a simple fashion. This function provides a way to compute such summations. The best way to understand this function is to try the examples below, which show how many common NumPy functions can be implemented as calls to einsum.

The subscripts string is a comma-separated list of subscript labels, where each label refers to a dimension of the corresponding operand. Repeated subscript labels in one operand take the diagonal. For example, ``np.einsum('ii', a)`` is equivalent to ``np.trace(a)``. Whenever a label is repeated, it is summed, so ``np.einsum('i,i', a, b)`` is equivalent to ``np.inner(a,b)``. If a label appears only once, it is not summed, so ``np.einsum('i', a)`` produces a view of ``a`` with no changes. The order of labels in the output is by default alphabetical. This means that ``np.einsum('ij', a)`` doesn't affect a 2D array, while ``np.einsum('ji', a)`` takes its transpose.

The output can be controlled by specifying output subscript labels as well. This specifies the label order, and allows summing to be disallowed or forced when desired. The call ``np.einsum('i->', a)`` is equivalent to ``np.sum(a, axis=-1)``, and ``np.einsum('ii->i', a)`` is equivalent to ``np.diag(a)``.

It is also possible to control how broadcasting occurs using an ellipsis. To take the trace along the first and last axes, you can do ``np.einsum('i...i', a)``, or to do a matrix-matrix product with the left-most indices instead of the rightmost, you can do ``np.einsum('ij...,jk...->ik...', a, b)``.
When there is only one operand, no axes are summed, and no output parameter is provided, a view into the operand is returned instead of a new array. Thus, taking the diagonal as ``np.einsum('ii->i', a)`` produces a view.

Parameters
----------
subscripts : string
    Specifies the subscripts for summation.
operands : list of array_like
    These are the arrays for the operation.
out : None or array
    If provided, the calculation is done into this array.
dtype : None or data type
    If provided, forces the calculation to use the data type specified. Note that you may have to also give a more liberal ``casting`` parameter to allow the conversions.
order : 'C', 'F', 'A', or 'K'
    Controls the memory layout of the output. 'C' means it should be Fortran contiguous. 'F' means it should be Fortran contiguous, 'A' means it should be 'F' if the inputs are all 'F', 'C' otherwise. 'K' means it should be as close to the layout as the inputs
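[Editor's note: the equivalences claimed in the docstring above are easy to verify directly. A quick sketch against a current NumPy; the arrays are arbitrary examples.]

```python
import numpy as np

a = np.arange(25).reshape(5, 5)
v = np.arange(5)

assert np.einsum('ii', a) == np.trace(a)                  # repeated label: trace
assert np.array_equal(np.einsum('ii->i', a), np.diag(a))  # kept output label: diagonal
assert np.einsum('i,i', v, v) == np.inner(v, v)           # shared label: inner product
assert np.array_equal(np.einsum('ji', a), a.T)            # alphabetical output order: transpose
assert np.einsum('i->', v) == np.sum(v)                   # empty output: full sum
print('all equivalences hold')
```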
Re: [Numpy-discussion] einsum
On Wed, Jan 26, 2011 at 6:41 PM, Jonathan Rocher jroc...@enthought.com wrote: Nice function, and wonderful that it speeds some tasks up. Some feedback: the following notation is a little counterintuitive to me: np.einsum('i...->', a) array([50, 55, 60, 65, 70]) np.sum(a, axis=0) array([50, 55, 60, 65, 70]) Since there is nothing after the '->', I expected a scalar, not a vector. I might suggest 'i...->...' Hmm, the dimension that's left is a broadcast dimension, and the dimension labeled 'i' did go away. I suppose disallowing the empty output string and forcing a '...' is reasonable. Would disallowing broadcasting by default be a good approach? Then, einsum('ii->i', a) would only accept two-dimensional inputs, and you would have to specify einsum('...ii->...i', a) to get the current default behavior for it. Just noticed also a typo in the doc: order : 'C', 'F', 'A', or 'K' Controls the memory layout of the output. 'C' means it should be Fortran contiguous. 'F' means it should be Fortran contiguous, should be changed to order : 'C', 'F', 'A', or 'K' Controls the memory layout of the output. 'C' means it should be C contiguous. 'F' means it should be Fortran contiguous, Thanks, Mark
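[Editor's note: for reference, the ellipsis form Mark mentions behaves as follows in released NumPy; a sketch with a made-up stacked array. Without an ellipsis, 'ii->i' on the 3-D array is rejected, since every operand dimension needs a label.]

```python
import numpy as np

x = np.arange(3 * 4 * 4).reshape(3, 4, 4)  # a stack of three 4x4 matrices

# Ellipsis form: take the diagonal of each matrix in the stack.
stacked = np.einsum('...ii->...i', x)

# Same thing written out with np.diag, one slice at a time.
manual = np.array([np.diag(x[k]) for k in range(3)])

print(np.array_equal(stacked, manual))  # True
```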
Re: [Numpy-discussion] einsum
On Wed, Jan 26, 2011 at 5:23 PM, josef.p...@gmail.com wrote: [snip] So, if I read the examples correctly, we finally get dot along an axis: np.einsum('ijk,ji->', a, b) or np.einsum('ijk,jik->k', a, b), or something like this. The notation might require getting used to, but it doesn't look worse than figuring out what tensordot does. I thought of various extensions to the notation, but the idea is tricky enough as is, I think. Decoding a regex-like syntax probably wouldn't help. The only disadvantage I see is that choosing the axes to operate on in a program or function requires string manipulation. One possibility would be for the Python exposure to accept lists or tuples of integers. The subscript 'ii' could be [(0,0)], and 'ij,jk->ik' could be [(0,1), (1,2), (0,2)]. Internally it would convert this directly to a C-string to pass to the API function. -Mark
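[Editor's note: an interleaved, integer-label calling convention along these lines did land in NumPy: each operand is followed by a list of axis labels, with an optional final list giving the output labels. A sketch with arbitrary example arrays.]

```python
import numpy as np

a = np.arange(25).reshape(5, 5)
b = np.arange(30).reshape(5, 6)

# 'ii' with integer labels: label 0 used twice on one operand -> trace.
tr = np.einsum(a, [0, 0])
assert tr == np.trace(a)

# 'ij,jk->ik' with integer labels: the trailing list is the output spec.
r = np.einsum(a, [0, 1], b, [1, 2], [0, 2])
assert np.array_equal(r, np.dot(a, b))
```

This addresses Josef's objection: the label lists can be built programmatically with ordinary list operations, with no string manipulation.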
Re: [Numpy-discussion] einsum
The only disadvantage I see is that choosing the axes to operate on in a program or function requires string manipulation. One possibility would be for the Python exposure to accept lists or tuples of integers. The subscript 'ii' could be [(0,0)], and 'ij,jk->ik' could be [(0,1), (1,2), (0,2)]. Internally it would convert this directly to a C-string to pass to the API function. -Mark What if you made objects i, j, etc. such that i*j = (0, 1) and so on? Maybe you could generate them with something like (i, j, k) = einstein((1, 2, 3)). Feel free to disregard me, since I haven't really thought too hard about things and might not even really understand what the problem is *anyway*. I'm just trying to help brainstorm. :) --Josh
Re: [Numpy-discussion] einsum
On Wed, Jan 26, 2011 at 8:29 PM, Joshua Holbrook josh.holbr...@gmail.com wrote: The only disadvantage I see is that choosing the axes to operate on in a program or function requires string manipulation. One possibility would be for the Python exposure to accept lists or tuples of integers. The subscript 'ii' could be [(0,0)], and 'ij,jk->ik' could be [(0,1), (1,2), (0,2)]. Internally it would convert this directly to a C-string to pass to the API function. -Mark What if you made objects i, j, etc. such that i*j = (0, 1) and so on? Maybe you could generate them with something like (i, j, k) = einstein((1, 2, 3)). Feel free to disregard me, since I haven't really thought too hard about things and might not even really understand what the problem is *anyway*. I'm just trying to help brainstorm. :) No worries. :) The problem is that someone will probably want to dynamically generate the axes to process in a loop, rather than having them hardcoded beforehand. For example, generalizing the diag function as follows. Within Python, creating lists and tuples is probably more natural. -Mark

>>> def diagij(x, i, j):
...     ss = ''
...     so = ''
...     # should error check i, j
...     fill = ord('b')
...     for k in range(x.ndim):
...         if k in [i, j]:
...             ss += 'a'
...         else:
...             ss += chr(fill)
...             so += chr(fill)
...             fill += 1
...     ss += '->' + so + 'a'
...     return np.einsum(ss, x)
...
>>> x = np.arange(3*3*3).reshape(3,3,3)
>>> diagij(x, 0, 1)
array([[ 0, 12, 24],
       [ 1, 13, 25],
       [ 2, 14, 26]])
>>> [np.diag(x[:,:,i]) for i in range(3)]
[array([ 0, 12, 24]), array([ 1, 13, 25]), array([ 2, 14, 26])]
>>> diagij(x, 1, 2)
array([[ 0,  4,  8],
       [ 9, 13, 17],
       [18, 22, 26]])