Nice function, and wonderful that it speeds some tasks up. some feedback: the following notation is a little counter intuitive to me: >>> np.einsum('i...->', a) array([50, 55, 60, 65, 70]) >>> np.sum(a, axis=0) array([50, 55, 60, 65, 70]) Since there is nothing after the ->, I expected a scalar not a vector. I might suggest 'i...->...'
Just noticed also a typo in the doc: order : 'C', 'F', 'A', or 'K' Controls the memory layout of the output. 'C' means it should be Fortran contiguous. 'F' means it should be Fortran contiguous, should be changed to order : 'C', 'F', 'A', or 'K' Controls the memory layout of the output. 'C' means it should be C contiguous. 'F' means it should be Fortran contiguous, Hope this helps, Jonathan On Wed, Jan 26, 2011 at 2:27 PM, Mark Wiebe <mwwi...@gmail.com> wrote: > I wrote a new function, einsum, which implements Einstein summation > notation, and I'd like comments/thoughts from people who might be interested > in this kind of thing. > > In testing it, it is also faster than many of NumPy's built-in functions, > except for dot and inner. At the bottom of this email you can find the > documentation blurb I wrote for it, and here are some timings: > > In [1]: import numpy as np > In [2]: a = np.arange(25).reshape(5,5) > > In [3]: timeit np.einsum('ii', a) > 100000 loops, best of 3: 3.45 us per loop > In [4]: timeit np.trace(a) > 100000 loops, best of 3: 9.8 us per loop > > In [5]: timeit np.einsum('ii->i', a) > 1000000 loops, best of 3: 1.19 us per loop > In [6]: timeit np.diag(a) > 100000 loops, best of 3: 7 us per loop > > In [7]: b = np.arange(30).reshape(5,6) > > In [8]: timeit np.einsum('ij,jk', a, b) > 10000 loops, best of 3: 11.4 us per loop > In [9]: timeit np.dot(a, b) > 100000 loops, best of 3: 2.8 us per loop > > In [10]: a = np.arange(10000.) > > In [11]: timeit np.einsum('i->', a) > 10000 loops, best of 3: 22.1 us per loop > In [12]: timeit np.sum(a) > 10000 loops, best of 3: 25.5 us per loop > > -Mark > > The documentation: > > einsum(subscripts, *operands, out=None, dtype=None, order='K', > casting='safe') > > Evaluates the Einstein summation convention on the operands. > > Using the Einstein summation convention, many common multi-dimensional > array operations can be represented in a simple fashion. This function > provides a way compute such summations. > > The best way to understand this function is to try the examples below, > which show how many common NumPy functions can be implemented as > calls to einsum. > > The subscripts string is a comma-separated list of subscript labels, > where each label refers to a dimension of the corresponding operand. > Repeated subscripts labels in one operand take the diagonal. For > example, > ``np.einsum('ii', a)`` is equivalent to ``np.trace(a)``. > > Whenever a label is repeated, it is summed, so ``np.einsum('i,i', a, > b)`` > is equivalent to ``np.inner(a,b)``. If a label appears only once, > it is not summed, so ``np.einsum('i', a)`` produces a view of ``a`` > with no changes. > > The order of labels in the output is by default alphabetical. This > means that ``np.einsum('ij', a)`` doesn't affect a 2D array, while > ``np.einsum('ji', a)`` takes its transpose. > > The output can be controlled by specifying output subscript labels > as well. This specifies the label order, and allows summing to be > disallowed or forced when desired. The call ``np.einsum('i->', a)`` > is equivalent to ``np.sum(a, axis=-1)``, and > ``np.einsum('ii->i', a)`` is equivalent to ``np.diag(a)``. > > It is also possible to control how broadcasting occurs using > an ellipsis. To take the trace along the first and last axes, > you can do ``np.einsum('i...i', a)``, or to do a matrix-matrix > product with the left-most indices instead of rightmost, you can do > ``np.einsum('ij...,jk...->ik...', a, b)``. > > When there is only one operand, no axes are summed, and no output > parameter is provided, a view into the operand is returned instead > of a new array. Thus, taking the diagonal as ``np.einsum('ii->i', a)`` > produces a view. > > Parameters > ---------- > subscripts : string > Specifies the subscripts for summation. > operands : list of array_like > These are the arrays for the operation. > out : None or array > If provided, the calculation is done into this array. > dtype : None or data type > If provided, forces the calculation to use the data type specified. > Note that you may have to also give a more liberal ``casting`` > parameter to allow the conversions. > order : 'C', 'F', 'A', or 'K' > Controls the memory layout of the output. 'C' means it should > be Fortran contiguous. 'F' means it should be Fortran contiguous, > 'A' means it should be 'F' if the inputs are all 'F', 'C' > otherwise. > 'K' means it should be as close to the layout as the inputs as > is possible, including arbitrarily permuted axes. > casting : 'no', 'equiv', 'safe', 'same_kind', 'unsafe' > Controls what kind of data casting may occur. Setting this to > 'unsafe' is not recommended, as it can adversely affect > accumulations. > 'no' means the data types should not be cast at all. 'equiv' means > only byte-order changes are allowed. 'safe' means only casts > which can preserve values are allowed. 'same_kind' means only > safe casts or casts within a kind, like float64 to float32, are > allowed. 'unsafe' means any data conversions may be done. > > Returns > ------- > output : ndarray > The calculation based on the Einstein summation convention. > > See Also > -------- > dot, inner, outer, tensordot > > > Examples > -------- > > >>> a = np.arange(25).reshape(5,5) > >>> b = np.arange(5) > >>> c = np.arange(6).reshape(2,3) > > >>> np.einsum('ii', a) > 60 > >>> np.trace(a) > 60 > > >>> np.einsum('ii->i', a) > array([ 0, 6, 12, 18, 24]) > >>> np.diag(a) > array([ 0, 6, 12, 18, 24]) > > >>> np.einsum('ij,j', a, b) > array([ 30, 80, 130, 180, 230]) > >>> np.dot(a, b) > array([ 30, 80, 130, 180, 230]) > > >>> np.einsum('ji', c) > array([[0, 3], > [1, 4], > [2, 5]]) > >>> c.T > array([[0, 3], > [1, 4], > [2, 5]]) > > >>> np.einsum(',', 3, c) > array([[ 0, 3, 6], > [ 9, 12, 15]]) > >>> np.multiply(3, c) > array([[ 0, 3, 6], > [ 9, 12, 15]]) > > >>> np.einsum('i,i', b, b) > 30 > >>> np.inner(b,b) > 30 > > >>> np.einsum('i,j', np.arange(2)+1, b) > array([[0, 1, 2, 3, 4], > [0, 2, 4, 6, 8]]) > >>> np.outer(np.arange(2)+1, b) > array([[0, 1, 2, 3, 4], > [0, 2, 4, 6, 8]]) > > >>> np.einsum('i...->', a) > array([50, 55, 60, 65, 70]) > >>> np.sum(a, axis=0) > array([50, 55, 60, 65, 70]) > > >>> a = np.arange(60.).reshape(3,4,5) > >>> b = np.arange(24.).reshape(4,3,2) > >>> np.einsum('ijk,jil->kl', a, b) > array([[ 4400., 4730.], > [ 4532., 4874.], > [ 4664., 5018.], > [ 4796., 5162.], > [ 4928., 5306.]]) > >>> np.tensordot(a,b, axes=([1,0],[0,1])) > array([[ 4400., 4730.], > [ 4532., 4874.], > [ 4664., 5018.], > [ 4796., 5162.], > [ 4928., 5306.]]) > > > _______________________________________________ > NumPy-Discussion mailing list > NumPy-Discussion@scipy.org > http://mail.scipy.org/mailman/listinfo/numpy-discussion > > -- Jonathan Rocher, Enthought, Inc. jroc...@enthought.com 1-512-536-1057 http://www.enthought.com
_______________________________________________ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion