Re: [Numpy-discussion] Integers to integer powers, let's make a decision
Charles R Harris wrote:
> 1. Integers to negative integer powers raise an error.
> 2. Integers to integer powers always result in floats.

2

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
https://mail.scipy.org/mailman/listinfo/numpy-discussion
Re: [Numpy-discussion] ENH: compute many inner products quickly
On Sun, Jun 5, 2016 at 6:41 PM, Stephan Hoyer wrote:
> If possible, I'd love to add new functions for "generalized ufunc" linear
> algebra, and then deprecate (or at least discourage) using the older
> versions with inferior broadcasting rules. Adding a new keyword arg means
> we'll be stuck with an awkward API for a long time to come.
>
> There are three types of matrix/vector products for which ufuncs would be
> nice:
> 1. matrix-matrix product (covered by matmul)
> 2. matrix-vector product
> 3. vector-vector (inner) product
>
> It's straightforward to implement either of the latter two options by
> inserting dummy dimensions and then calling matmul, but that's a pretty
> awkward API, especially for inner products. Unfortunately, we already use
> the two most obvious one-word names for vector inner products (inner and
> dot). But on the other hand, one-word names are not very descriptive, and
> the short name "dot" probably mostly exists because of the lack of an
> infix operator.
>
> So I'll start by throwing out some potential new names:
>
> For matrix-vector products:
> matvecmul (if it's worth making a new operator)
>
> For inner products:
> vecmul (similar to matmul, but probably too ambiguous)
> dot_product
> inner_prod
> inner_product

I was using mulmatvec, mulvecmat, mulvecvec back when I was looking at
this. I suppose the mul could also go in the middle, or maybe change it
to x and put it in the middle: matxvec, vecxmat, vecxvec.

Chuck
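A minimal sketch of the "dummy dimensions + matmul" workaround Stephan describes. The helper names `matvec` and `vecvec` are placeholders for illustration, not NumPy API:

```python
import numpy as np

def matvec(a, x):
    """Broadcasting matrix-vector product via matmul."""
    # Treat x as a stack of column vectors, multiply, then drop the dummy axis.
    return (a @ x[..., np.newaxis])[..., 0]

def vecvec(x, y):
    """Broadcasting vector-vector (inner) product via matmul."""
    # Row vector times column vector gives a 1x1 result per broadcast element.
    return (x[..., np.newaxis, :] @ y[..., :, np.newaxis])[..., 0, 0]

A = np.arange(6.0).reshape(2, 3)
v = np.array([1.0, 2.0, 3.0])

print(matvec(A, v))   # same as A @ v
print(vecvec(v, v))   # same as v @ v
```

This works, but wrapping and unwrapping the dummy axes at every call site is exactly the awkwardness the proposed named ufuncs would avoid.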
Re: [Numpy-discussion] ENH: compute many inner products quickly
A simple workaround gets the speed back:

In [11]: %timeit (X.T * A.dot(X.T)).sum(axis=0)
1 loop, best of 3: 612 ms per loop

In [12]: %timeit np.einsum('ij,ji->j', A.dot(X.T), X)
1 loop, best of 3: 414 ms per loop

If working as advertised, the code in gh-5488 will convert the
three-argument einsum call into my version automatically.

On Sun, Jun 5, 2016 at 7:44 PM, Stephan Hoyer wrote:
> On Sun, Jun 5, 2016 at 5:08 PM, Mark Daoust wrote:
>> Here's the einsum version:
>>
>> `es = np.einsum('Na,ab,Nb->N', X, A, X)`
>>
>> But that's running ~45x slower than your version.
>>
>> OT: anyone know why einsum is so bad for this one?
>
> I think einsum can create some large intermediate arrays. It certainly
> doesn't always do multiplication in the optimal order:
> https://github.com/numpy/numpy/pull/5488
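A small-scale check that the three expressions in this thread compute the same thing; the shapes are shrunk from the original 50k x 1k purely for illustration, and `default_rng` (NumPy >= 1.17) is assumed for reproducible data:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 100, 8
X = rng.standard_normal((n, d))   # rows are the vectors x_i
A = rng.standard_normal((d, d))

quad_einsum3 = np.einsum('Na,ab,Nb->N', X, A, X)      # slow 3-argument form
quad_matmul  = (X.T * A.dot(X.T)).sum(axis=0)         # "slick" rewrite
quad_einsum2 = np.einsum('ij,ji->j', A.dot(X.T), X)   # 2-argument rewrite

print(np.allclose(quad_einsum3, quad_matmul))
print(np.allclose(quad_einsum3, quad_einsum2))
```

All three evaluate x_i^T A x_i for every row; only the contraction order (and hence whether BLAS gets used) differs.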
Re: [Numpy-discussion] ENH: compute many inner products quickly
On Sun, Jun 5, 2016 at 8:41 PM, Stephan Hoyer wrote:
> If possible, I'd love to add new functions for "generalized ufunc" linear
> algebra, and then deprecate (or at least discourage) using the older
> versions with inferior broadcasting rules. Adding a new keyword arg means
> we'll be stuck with an awkward API for a long time to come.
>
> There are three types of matrix/vector products for which ufuncs would be
> nice:
> 1. matrix-matrix product (covered by matmul)
> 2. matrix-vector product
> 3. vector-vector (inner) product
>
> It's straightforward to implement either of the latter two options by
> inserting dummy dimensions and then calling matmul, but that's a pretty
> awkward API, especially for inner products. Unfortunately, we already use
> the two most obvious one-word names for vector inner products (inner and
> dot). But on the other hand, one-word names are not very descriptive, and
> the short name "dot" probably mostly exists because of the lack of an
> infix operator.
>
> So I'll start by throwing out some potential new names:
>
> For matrix-vector products:
> matvecmul (if it's worth making a new operator)
>
> For inner products:
> vecmul (similar to matmul, but probably too ambiguous)
> dot_product
> inner_prod
> inner_product

How about names in plural, as in the PR? I thought the `s` in inner_prods
would better signal the broadcasting behavior.

dot_products ... "dots"? (I guess not)

Josef

> On Sat, May 28, 2016 at 8:53 PM, Scott Sievert wrote:
>> I recently ran into an application where I had to compute many inner
>> products quickly (roughly 50k inner products in less than a second). I
>> wanted a vector of inner products over the 50k vectors, or
>> `[x1.T @ A @ x1, …, xn.T @ A @ xn]` with A.shape = (1k, 1k).
>>
>> My first instinct was to look for a NumPy function to quickly compute
>> this, such as np.inner. However, it looks like np.inner has some other
>> behavior and I couldn’t get tensordot/einsum to work for me.
>>
>> Then a labmate pointed out that I can just do some slick matrix
>> multiplication to compute the same quantity,
>> `(X.T * A @ X.T).sum(axis=0)`. I opened [a PR] with this, and proposed
>> that we define a new function called `inner_prods` for this.
>>
>> However, in the PR, @shoyer pointed out
>>
>> > The main challenge is to figure out how to transition the behavior of
>> > all these operations, while preserving backwards compatibility. Quite
>> > likely, we need to pick new names for these functions, though we
>> > should try to pick something that doesn't suggest that they are
>> > second-class alternatives.
>>
>> Do we choose new function names? Do we add a keyword arg that changes
>> what np.inner returns?
>>
>> [a PR]: https://github.com/numpy/numpy/pull/7690
Re: [Numpy-discussion] ENH: compute many inner products quickly
On Sun, Jun 5, 2016 at 5:08 PM, Mark Daoust wrote:
> Here's the einsum version:
>
> `es = np.einsum('Na,ab,Nb->N', X, A, X)`
>
> But that's running ~45x slower than your version.
>
> OT: anyone know why einsum is so bad for this one?

I think einsum can create some large intermediate arrays. It certainly
doesn't always do multiplication in the optimal order:
https://github.com/numpy/numpy/pull/5488
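The contraction-ordering work in gh-5488 later shipped as einsum's `optimize` keyword (and the companion `np.einsum_path`), so on current NumPy the three-argument form can be rescued without a manual rewrite. A sketch, with shapes shrunk for illustration:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((200, 16))
A = rng.standard_normal((16, 16))

# Default evaluation uses a single nested loop over all indices (no BLAS);
# optimize=True lets einsum pick a cheaper pairwise contraction path.
naive = np.einsum('Na,ab,Nb->N', X, A, X)
opt   = np.einsum('Na,ab,Nb->N', X, A, X, optimize=True)

# einsum_path reports the chosen contraction order and estimated cost.
path, description = np.einsum_path('Na,ab,Nb->N', X, A, X, optimize='greedy')
print(path)

print(np.allclose(naive, opt))
```

Both evaluations return identical values; only the order (and speed) of the contractions differs.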
Re: [Numpy-discussion] ENH: compute many inner products quickly
If possible, I'd love to add new functions for "generalized ufunc" linear
algebra, and then deprecate (or at least discourage) using the older
versions with inferior broadcasting rules. Adding a new keyword arg means
we'll be stuck with an awkward API for a long time to come.

There are three types of matrix/vector products for which ufuncs would be
nice:
1. matrix-matrix product (covered by matmul)
2. matrix-vector product
3. vector-vector (inner) product

It's straightforward to implement either of the latter two options by
inserting dummy dimensions and then calling matmul, but that's a pretty
awkward API, especially for inner products. Unfortunately, we already use
the two most obvious one-word names for vector inner products (inner and
dot). But on the other hand, one-word names are not very descriptive, and
the short name "dot" probably mostly exists because of the lack of an
infix operator.

So I'll start by throwing out some potential new names:

For matrix-vector products:
matvecmul (if it's worth making a new operator)

For inner products:
vecmul (similar to matmul, but probably too ambiguous)
dot_product
inner_prod
inner_product

On Sat, May 28, 2016 at 8:53 PM, Scott Sievert wrote:
> I recently ran into an application where I had to compute many inner
> products quickly (roughly 50k inner products in less than a second). I
> wanted a vector of inner products over the 50k vectors, or
> `[x1.T @ A @ x1, …, xn.T @ A @ xn]` with A.shape = (1k, 1k).
>
> My first instinct was to look for a NumPy function to quickly compute
> this, such as np.inner. However, it looks like np.inner has some other
> behavior and I couldn’t get tensordot/einsum to work for me.
>
> Then a labmate pointed out that I can just do some slick matrix
> multiplication to compute the same quantity,
> `(X.T * A @ X.T).sum(axis=0)`. I opened [a PR] with this, and proposed
> that we define a new function called `inner_prods` for this.
>
> However, in the PR, @shoyer pointed out
>
> > The main challenge is to figure out how to transition the behavior of
> > all these operations, while preserving backwards compatibility. Quite
> > likely, we need to pick new names for these functions, though we
> > should try to pick something that doesn't suggest that they are
> > second-class alternatives.
>
> Do we choose new function names? Do we add a keyword arg that changes
> what np.inner returns?
>
> [a PR]: https://github.com/numpy/numpy/pull/7690
Re: [Numpy-discussion] ENH: compute many inner products quickly
Here's the einsum version:

`es = np.einsum('Na,ab,Nb->N', X, A, X)`

But that's running ~45x slower than your version.

OT: anyone know why einsum is so bad for this one?

Mark Daoust

On Sat, May 28, 2016 at 11:53 PM, Scott Sievert wrote:
> I recently ran into an application where I had to compute many inner
> products quickly (roughly 50k inner products in less than a second). I
> wanted a vector of inner products over the 50k vectors, or
> `[x1.T @ A @ x1, …, xn.T @ A @ xn]` with A.shape = (1k, 1k).
>
> My first instinct was to look for a NumPy function to quickly compute
> this, such as np.inner. However, it looks like np.inner has some other
> behavior and I couldn’t get tensordot/einsum to work for me.
>
> Then a labmate pointed out that I can just do some slick matrix
> multiplication to compute the same quantity,
> `(X.T * A @ X.T).sum(axis=0)`. I opened [a PR] with this, and proposed
> that we define a new function called `inner_prods` for this.
>
> However, in the PR, @shoyer pointed out
>
> > The main challenge is to figure out how to transition the behavior of
> > all these operations, while preserving backwards compatibility. Quite
> > likely, we need to pick new names for these functions, though we
> > should try to pick something that doesn't suggest that they are
> > second-class alternatives.
>
> Do we choose new function names? Do we add a keyword arg that changes
> what np.inner returns?
>
> [a PR]: https://github.com/numpy/numpy/pull/7690
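A tiny sketch of the quantity being discussed, comparing the naive per-row loop against Scott's vectorized rewrite (shapes shrunk from 50k x 1k for illustration). Note one subtlety: `*` and `@` have equal precedence in Python and associate left-to-right, so the rewrite needs explicit parentheses around `A @ X.T`:

```python
import numpy as np

rng = np.random.default_rng(1)
n, d = 50, 4                       # original was ~50k vectors of length ~1k
X = rng.standard_normal((n, d))    # rows are the x_i
A = rng.standard_normal((d, d))

# Naive loop: one quadratic form x_i^T A x_i per row.
loop = np.array([x @ A @ x for x in X])

# Vectorized rewrite: parenthesize A @ X.T, since `X.T * A @ X.T`
# would parse as `(X.T * A) @ X.T` and fail for n != d.
fast = (X.T * (A @ X.T)).sum(axis=0)

print(np.allclose(loop, fast))
```

The vectorized form replaces n small matrix-vector products with one large matrix-matrix product plus an elementwise multiply and reduction, which is what recovers the speed.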
Re: [Numpy-discussion] Integers to integer powers, let's make a decision
On 6/4/2016 10:23 PM, Charles R Harris wrote:
> From my point of view, backwards compatibility is the main reason for
> choosing 1, otherwise I'd pick 2. If it weren't so easy to get floating
> point by using floating exponents I'd probably choose differently.

As an interested user, I offer a summary of some things I believe are
being claimed about the two proposals on the table (for int**int),
which are:

1. raise an error for negative powers
2. always return float

Here is a first draft comparison (for int**int only):

Proposal 1 effectively throws away an operator:
- true in this: np.arange(10)**10 already overflows even for int32, much
  less smaller sizes, and negative powers are now errors
- false in this: you can change an argument to float

Proposal 1 effectively behaves more like Python:
- true in this: for a very small range of numbers, int**int will return
  int in Python 2
- false in this: in Python, negative exponents produce floats, and
  int**int does not overflow

Proposal 1 is more backwards compatible:
- true, but this really only affects int**2 (larger arguments quickly
  overflow)

Proposal 2 is a better match for other languages:
- basically true (see e.g. C++'s overloaded `pow`)

Proposal 2 better satisfies the principle of least surprise:
- probably true for most users, possibly false for some

Feel free to add, correct, modify. I think there is a strong argument to
always return float, and the real question is whether it is strong
enough to sacrifice backwards compatibility.

Hope this summary is of some use and not too tendentious,
Alan

___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
https://mail.scipy.org/mailman/listinfo/numpy-discussion
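Alan's overflow and negative-power claims can be checked directly. A sketch, assuming a NumPy recent enough to have adopted proposal 1 (which is what ultimately happened, in NumPy 1.12), so the negative-power case is guarded with try/except:

```python
import numpy as np

# The overflow claim: 9**10 = 3486784401 exceeds the int32 maximum
# (2**31 - 1), so int32 ** int silently wraps around.
wrapped = np.arange(10, dtype=np.int32) ** 10
print(wrapped[9], 9 ** 10)   # wrapped (negative) value vs exact Python int

# Python itself never overflows, and returns float for negative exponents:
print(2 ** -1)               # 0.5

# Proposal 1 in practice: int ** negative int raises, while casting an
# operand to float gives the float result instead.
try:
    np.arange(1, 4) ** -1
except ValueError as err:
    print('raised:', err)
print(np.arange(1, 4) ** -1.0)   # elementwise float reciprocals
```

This illustrates the trade-off in the summary: the error (proposal 1) makes the silent-wraparound operator partially unusable but is one `astype(float)` away from the float behavior of proposal 2.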