Re: [Numpy-discussion] ENH: compute many inner products quickly
On Mon, Jun 6, 2016 at 3:32 PM, Jaime Fernández del Río < jaime.f...@gmail.com> wrote: > Since we are at it, should quadratic/bilinear forms get their own function > too? That is, after all, what the OP was asking for. > If we have matvecmul and vecmul, then how to implement bilinear forms efficiently becomes pretty clear: np.vecmul(b, np.matvecmul(A, b)) I'm not sure writing a dedicated function in numpy itself makes sense for something this easy. I suppose there would be some performance gains from not saving the immediate result, but I suspect this would be premature optimization in most cases. ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] ENH: compute many inner products quickly
On Sun, Jun 5, 2016 at 5:41 PM, Stephan Hoyer wrote: > If possible, I'd love to add new functions for "generalized ufunc" linear > algebra, and then deprecate (or at least discourage) using the older > versions with inferior broadcasting rules. Adding a new keyword arg means > we'll be stuck with an awkward API for a long time to come. > > There are three types of matrix/vector products for which ufuncs would be > nice: > 1. matrix-matrix product (covered by matmul) > 2. matrix-vector product > 3. vector-vector (inner) product > > It's straightful to implement either of the later two options by inserting > dummy dimensions and then calling matmul, but that's a pretty awkward API, > especially for inner products. Unfortunately, we already use the two most > obvious one word names for vector inner products (inner and dot). But on the > other hand, one word names are not very descriptive, and the short name > "dot" probably mostly exists because of the lack of an infix operator. > > So I'll start by throwing out some potential new names: > > For matrix-vector products: > matvecmul (if it's worth making a new operator) > > For inner products: > vecmul (similar to matmul, but probably too ambiguous) > dot_product > inner_prod > inner_product Given how core to linear algebra these are, and that this is a family of somewhat expert-oriented functions, I think it'd even be fine to leave the "product" part implicit, like: np.linalg.matrix_matrix np.linalg.matrix_vector np.linalg.vector_matrix np.linalg.vector_vector np.linalg.vector_matrix_vector (for bilinear forms) (or we could shorten matrix -> mat, vector -> vec if we must.) -n -- Nathaniel J. Smith -- https://vorpus.org ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] ENH: compute many inner products quickly
On Di, 2016-06-07 at 00:32 +0200, Jaime Fernández del Río wrote: > On Mon, Jun 6, 2016 at 9:35 AM, Sebastian Berg s.net> wrote: > > On So, 2016-06-05 at 19:20 -0600, Charles R Harris wrote: > > > > > > > > > On Sun, Jun 5, 2016 at 6:41 PM, Stephan Hoyer > > > wrote: > > > > If possible, I'd love to add new functions for "generalized > > ufunc" > > > > linear algebra, and then deprecate (or at least discourage) > > using > > > > the older versions with inferior broadcasting rules. Adding a > > new > > > > keyword arg means we'll be stuck with an awkward API for a long > > > > time to come. > > > > > > > > There are three types of matrix/vector products for which > > ufuncs > > > > would be nice: > > > > 1. matrix-matrix product (covered by matmul) > > > > 2. matrix-vector product > > > > 3. vector-vector (inner) product > > > > > > > > It's straightful to implement either of the later two options > > by > > > > inserting dummy dimensions and then calling matmul, but that's > > a > > > > pretty awkward API, especially for inner products. > > Unfortunately, > > > > we already use the two most obvious one word names for vector > > inner > > > > products (inner and dot). But on the other hand, one word names > > are > > > > not very descriptive, and the short name "dot" probably mostly > > > > exists because of the lack of an infix operator. > > > > > > > > So I'll start by throwing out some potential new names: > > > > > > > > For matrix-vector products: > > > > matvecmul (if it's worth making a new operator) > > > > > > > > For inner products: > > > > vecmul (similar to matmul, but probably too ambiguous) > > > > dot_product > > > > inner_prod > > > > inner_product > > > > > > > I was using mulmatvec, mulvecmat, mulvecvec back when I was > > looking > > > at this. I suppose the mul could also go in the middle, or maybe > > > change it to x and put it in the middle: matxvec, vecxmat, > > vecxvec. > > > > > > > Were not some of this part of the gufunc linalg functions and we > > just > > removed it because we were not sure about the API? Not sure > > anymore, > > but might be worth to have a look. > We have > from numpy.core.umath_tests import inner1d > which does vectorized vector-vector multiplication, but it's > undocumented. There is also a matrix_multiply in that same module > that does the obvious thing. > And when gufuncs were introduced in linalg, there were a bunch of > functions doing all sorts of operations without intermediate storage, > e.g. sum3(a, b, c) -> a + b + c, that were removed before merging the > PR. Wasn't involved at the time, so not sure what the rationale was. I think it was probably just that the api was not thought out much. Adding sum3 to linalg does seem a bit funny ;). I would not mind it in numpy as such I guess, if it quite a bit faster anyway, but maybe in its own submodule for these kind of performance optimizations. - Sebastian > Since we are at it, should quadratic/bilinear forms get their own > function too? That is, after all, what the OP was asking for. > Jaime > -- > (\__/) > ( O.o) > ( > <) Este es Conejo. Copia a Conejo en tu firma y ayúdale en sus > planes de dominación mundial. > ___ > NumPy-Discussion mailing list > NumPy-Discussion@scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion signature.asc Description: This is a digitally signed message part ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] ENH: compute many inner products quickly
On Mon, Jun 6, 2016 at 9:35 AM, Sebastian Berg wrote: > On So, 2016-06-05 at 19:20 -0600, Charles R Harris wrote: > > > > > > On Sun, Jun 5, 2016 at 6:41 PM, Stephan Hoyer > > wrote: > > > If possible, I'd love to add new functions for "generalized ufunc" > > > linear algebra, and then deprecate (or at least discourage) using > > > the older versions with inferior broadcasting rules. Adding a new > > > keyword arg means we'll be stuck with an awkward API for a long > > > time to come. > > > > > > There are three types of matrix/vector products for which ufuncs > > > would be nice: > > > 1. matrix-matrix product (covered by matmul) > > > 2. matrix-vector product > > > 3. vector-vector (inner) product > > > > > > It's straightful to implement either of the later two options by > > > inserting dummy dimensions and then calling matmul, but that's a > > > pretty awkward API, especially for inner products. Unfortunately, > > > we already use the two most obvious one word names for vector inner > > > products (inner and dot). But on the other hand, one word names are > > > not very descriptive, and the short name "dot" probably mostly > > > exists because of the lack of an infix operator. > > > > > > So I'll start by throwing out some potential new names: > > > > > > For matrix-vector products: > > > matvecmul (if it's worth making a new operator) > > > > > > For inner products: > > > vecmul (similar to matmul, but probably too ambiguous) > > > dot_product > > > inner_prod > > > inner_product > > > > > I was using mulmatvec, mulvecmat, mulvecvec back when I was looking > > at this. I suppose the mul could also go in the middle, or maybe > > change it to x and put it in the middle: matxvec, vecxmat, vecxvec. > > > > Were not some of this part of the gufunc linalg functions and we just > removed it because we were not sure about the API? Not sure anymore, > but might be worth to have a look. > We have from numpy.core.umath_tests import inner1d which does vectorized vector-vector multiplication, but it's undocumented. There is also a matrix_multiply in that same module that does the obvious thing. And when gufuncs were introduced in linalg, there were a bunch of functions doing all sorts of operations without intermediate storage, e.g. sum3(a, b, c) -> a + b + c, that were removed before merging the PR. Wasn't involved at the time, so not sure what the rationale was. Since we are at it, should quadratic/bilinear forms get their own function too? That is, after all, what the OP was asking for. Jaime -- (\__/) ( O.o) ( > <) Este es Conejo. Copia a Conejo en tu firma y ayúdale en sus planes de dominación mundial. ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] ENH: compute many inner products quickly
There I was thinking vector-vector inner product was in fact covered by `np.inner`. Yikes, half inner, half outer. As for names, I think `matvecmul` and `vecmul` do seem quite OK (probably need `vecmatmul` as well, which does the same as `matmul` would for 1-D first argument). But as other suggestions, keeping the `dot` one could think of `vec_dot_vec` and `mat_dot_vec`, etc. More obscure but shorter would be to use the equivalent `einsum` notation: `i_i`, `ij_j`, `i_ij`, `ij_jk`. -- Marten ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] ENH: compute many inner products quickly
On So, 2016-06-05 at 19:20 -0600, Charles R Harris wrote: > > > On Sun, Jun 5, 2016 at 6:41 PM, Stephan Hoyer > wrote: > > If possible, I'd love to add new functions for "generalized ufunc" > > linear algebra, and then deprecate (or at least discourage) using > > the older versions with inferior broadcasting rules. Adding a new > > keyword arg means we'll be stuck with an awkward API for a long > > time to come. > > > > There are three types of matrix/vector products for which ufuncs > > would be nice: > > 1. matrix-matrix product (covered by matmul) > > 2. matrix-vector product > > 3. vector-vector (inner) product > > > > It's straightful to implement either of the later two options by > > inserting dummy dimensions and then calling matmul, but that's a > > pretty awkward API, especially for inner products. Unfortunately, > > we already use the two most obvious one word names for vector inner > > products (inner and dot). But on the other hand, one word names are > > not very descriptive, and the short name "dot" probably mostly > > exists because of the lack of an infix operator. > > > > So I'll start by throwing out some potential new names: > > > > For matrix-vector products: > > matvecmul (if it's worth making a new operator) > > > > For inner products: > > vecmul (similar to matmul, but probably too ambiguous) > > dot_product > > inner_prod > > inner_product > > > I was using mulmatvec, mulvecmat, mulvecvec back when I was looking > at this. I suppose the mul could also go in the middle, or maybe > change it to x and put it in the middle: matxvec, vecxmat, vecxvec. > Were not some of this part of the gufunc linalg functions and we just removed it because we were not sure about the API? Not sure anymore, but might be worth to have a look. - Sebastian > Chuck > > ___ > NumPy-Discussion mailing list > NumPy-Discussion@scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion signature.asc Description: This is a digitally signed message part ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] ENH: compute many inner products quickly
On Sun, Jun 5, 2016 at 6:41 PM, Stephan Hoyer wrote: > If possible, I'd love to add new functions for "generalized ufunc" linear > algebra, and then deprecate (or at least discourage) using the older > versions with inferior broadcasting rules. Adding a new keyword arg means > we'll be stuck with an awkward API for a long time to come. > > There are three types of matrix/vector products for which ufuncs would be > nice: > 1. matrix-matrix product (covered by matmul) > 2. matrix-vector product > 3. vector-vector (inner) product > > It's straightful to implement either of the later two options by inserting > dummy dimensions and then calling matmul, but that's a pretty awkward API, > especially for inner products. Unfortunately, we already use the two most > obvious one word names for vector inner products (inner and dot). But on > the other hand, one word names are not very descriptive, and the short name > "dot" probably mostly exists because of the lack of an infix operator. > > So I'll start by throwing out some potential new names: > > For matrix-vector products: > matvecmul (if it's worth making a new operator) > > For inner products: > vecmul (similar to matmul, but probably too ambiguous) > dot_product > inner_prod > inner_product > I was using mulmatvec, mulvecmat, mulvecvec back when I was looking at this. I suppose the mul could also go in the middle, or maybe change it to x and put it in the middle: matxvec, vecxmat, vecxvec. Chuck ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] ENH: compute many inner products quickly
A simple workaround gets the speed back: In [11]: %timeit (X.T * A.dot(X.T)).sum(axis=0) 1 loop, best of 3: 612 ms per loop In [12]: %timeit np.einsum('ij,ji->j', A.dot(X.T), X) 1 loop, best of 3: 414 ms per loop If working as advertised, the code in gh-5488 will convert the three-argument einsum call into my version automatically. On Sun, Jun 5, 2016 at 7:44 PM, Stephan Hoyer wrote: > On Sun, Jun 5, 2016 at 5:08 PM, Mark Daoust wrote: > >> Here's the einsum version: >> >> `es = np.einsum('Na,ab,Nb->N',X,A,X)` >> >> But that's running ~45x slower than your version. >> >> OT: anyone know why einsum is so bad for this one? >> > > I think einsum can create some large intermediate arrays. It certainly > doesn't always do multiplication in the optimal order: > https://github.com/numpy/numpy/pull/5488 > > ___ > NumPy-Discussion mailing list > NumPy-Discussion@scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] ENH: compute many inner products quickly
On Sun, Jun 5, 2016 at 8:41 PM, Stephan Hoyer wrote: > If possible, I'd love to add new functions for "generalized ufunc" linear > algebra, and then deprecate (or at least discourage) using the older > versions with inferior broadcasting rules. Adding a new keyword arg means > we'll be stuck with an awkward API for a long time to come. > > There are three types of matrix/vector products for which ufuncs would be > nice: > 1. matrix-matrix product (covered by matmul) > 2. matrix-vector product > 3. vector-vector (inner) product > > It's straightful to implement either of the later two options by inserting > dummy dimensions and then calling matmul, but that's a pretty awkward API, > especially for inner products. Unfortunately, we already use the two most > obvious one word names for vector inner products (inner and dot). But on > the other hand, one word names are not very descriptive, and the short name > "dot" probably mostly exists because of the lack of an infix operator. > > So I'll start by throwing out some potential new names: > > For matrix-vector products: > matvecmul (if it's worth making a new operator) > > For inner products: > vecmul (similar to matmul, but probably too ambiguous) > dot_product > inner_prod > inner_product > > how about names in plural as in the PR I thought the `s` in inner_prods would signal better the broadcasting behavior dot_products ... "dots" ? (I guess not) Josef > > > > > On Sat, May 28, 2016 at 8:53 PM, Scott Sievert > wrote: > >> I recently ran into an application where I had to compute many inner >> products quickly (roughy 50k inner products in less than a second). I >> wanted a vector of inner products over the 50k vectors, or `[x1.T @ A @ x1, >> …, xn.T @ A @ xn]` with A.shape = (1k, 1k). >> >> My first instinct was to look for a NumPy function to quickly compute >> this, such as np.inner. However, it looks like np.inner has some other >> behavior and I couldn’t get tensordot/einsum to work for me. >> >> Then a labmate pointed out that I can just do some slick matrix >> multiplication to compute the same quantity, `(X.T * A @ X.T).sum(axis=0)`. >> I opened [a PR] with this, and proposed that we define a new function >> called `inner_prods` for this. >> >> However, in the PR, @shoyer pointed out >> >> > The main challenge is to figure out how to transition the behavior of >> all these operations, while preserving backwards compatibility. Quite >> likely, we need to pick new names for these functions, though we should try >> to pick something that doesn't suggest that they are second class >> alternatives. >> >> Do we choose new function names? Do we add a keyword arg that changes >> what np.inner returns? >> >> [a PR]:https://github.com/numpy/numpy/pull/7690 >> >> >> >> >> ___ >> NumPy-Discussion mailing list >> NumPy-Discussion@scipy.org >> https://mail.scipy.org/mailman/listinfo/numpy-discussion >> >> > > ___ > NumPy-Discussion mailing list > NumPy-Discussion@scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] ENH: compute many inner products quickly
On Sun, Jun 5, 2016 at 5:08 PM, Mark Daoust wrote: > Here's the einsum version: > > `es = np.einsum('Na,ab,Nb->N',X,A,X)` > > But that's running ~45x slower than your version. > > OT: anyone know why einsum is so bad for this one? > I think einsum can create some large intermediate arrays. It certainly doesn't always do multiplication in the optimal order: https://github.com/numpy/numpy/pull/5488 ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] ENH: compute many inner products quickly
If possible, I'd love to add new functions for "generalized ufunc" linear algebra, and then deprecate (or at least discourage) using the older versions with inferior broadcasting rules. Adding a new keyword arg means we'll be stuck with an awkward API for a long time to come. There are three types of matrix/vector products for which ufuncs would be nice: 1. matrix-matrix product (covered by matmul) 2. matrix-vector product 3. vector-vector (inner) product It's straightful to implement either of the later two options by inserting dummy dimensions and then calling matmul, but that's a pretty awkward API, especially for inner products. Unfortunately, we already use the two most obvious one word names for vector inner products (inner and dot). But on the other hand, one word names are not very descriptive, and the short name "dot" probably mostly exists because of the lack of an infix operator. So I'll start by throwing out some potential new names: For matrix-vector products: matvecmul (if it's worth making a new operator) For inner products: vecmul (similar to matmul, but probably too ambiguous) dot_product inner_prod inner_product On Sat, May 28, 2016 at 8:53 PM, Scott Sievert wrote: > I recently ran into an application where I had to compute many inner > products quickly (roughy 50k inner products in less than a second). I > wanted a vector of inner products over the 50k vectors, or `[x1.T @ A @ x1, > …, xn.T @ A @ xn]` with A.shape = (1k, 1k). > > My first instinct was to look for a NumPy function to quickly compute > this, such as np.inner. However, it looks like np.inner has some other > behavior and I couldn’t get tensordot/einsum to work for me. > > Then a labmate pointed out that I can just do some slick matrix > multiplication to compute the same quantity, `(X.T * A @ X.T).sum(axis=0)`. > I opened [a PR] with this, and proposed that we define a new function > called `inner_prods` for this. > > However, in the PR, @shoyer pointed out > > > The main challenge is to figure out how to transition the behavior of > all these operations, while preserving backwards compatibility. Quite > likely, we need to pick new names for these functions, though we should try > to pick something that doesn't suggest that they are second class > alternatives. > > Do we choose new function names? Do we add a keyword arg that changes what > np.inner returns? > > [a PR]:https://github.com/numpy/numpy/pull/7690 > > > > > ___ > NumPy-Discussion mailing list > NumPy-Discussion@scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] ENH: compute many inner products quickly
Here's the einsum version: `es = np.einsum('Na,ab,Nb->N',X,A,X)` But that's running ~45x slower than your version. OT: anyone know why einsum is so bad for this one? Mark Daoust On Sat, May 28, 2016 at 11:53 PM, Scott Sievert wrote: > I recently ran into an application where I had to compute many inner > products quickly (roughy 50k inner products in less than a second). I > wanted a vector of inner products over the 50k vectors, or `[x1.T @ A @ x1, > …, xn.T @ A @ xn]` with A.shape = (1k, 1k). > > My first instinct was to look for a NumPy function to quickly compute > this, such as np.inner. However, it looks like np.inner has some other > behavior and I couldn’t get tensordot/einsum to work for me. > > Then a labmate pointed out that I can just do some slick matrix > multiplication to compute the same quantity, `(X.T * A @ X.T).sum(axis=0)`. > I opened [a PR] with this, and proposed that we define a new function > called `inner_prods` for this. > > However, in the PR, @shoyer pointed out > > > The main challenge is to figure out how to transition the behavior of > all these operations, while preserving backwards compatibility. Quite > likely, we need to pick new names for these functions, though we should try > to pick something that doesn't suggest that they are second class > alternatives. > > Do we choose new function names? Do we add a keyword arg that changes what > np.inner returns? > > [a PR]:https://github.com/numpy/numpy/pull/7690 > > > > > ___ > NumPy-Discussion mailing list > NumPy-Discussion@scipy.org > https://mail.scipy.org/mailman/listinfo/numpy-discussion > > ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion
[Numpy-discussion] ENH: compute many inner products quickly
I recently ran into an application where I had to compute many inner products quickly (roughy 50k inner products in less than a second). I wanted a vector of inner products over the 50k vectors, or `[x1.T @ A @ x1, …, xn.T @ A @ xn]` with A.shape = (1k, 1k). My first instinct was to look for a NumPy function to quickly compute this, such as np.inner. However, it looks like np.inner has some other behavior and I couldn’t get tensordot/einsum to work for me. Then a labmate pointed out that I can just do some slick matrix multiplication to compute the same quantity, `(X.T * A @ X.T).sum(axis=0)`. I opened [a PR] with this, and proposed that we define a new function called `inner_prods` for this. However, in the PR, @shoyer pointed out > The main challenge is to figure out how to transition the behavior of all >these operations, while preserving backwards compatibility. Quite likely, we >need to pick new names for these functions, though we should try to pick >something that doesn't suggest that they are second class alternatives. Do we choose new function names? Do we add a keyword arg that changes what np.inner returns? [a PR]:https://github.com/numpy/numpy/pull/7690 ___ NumPy-Discussion mailing list NumPy-Discussion@scipy.org https://mail.scipy.org/mailman/listinfo/numpy-discussion