[Numpy-discussion] Re: New matvec and vecmat functions
> Could you please offer some code or math notation to help communicate this? > I am forced to guess at the need. > > The words "matrix" and "vector" are ambiguous. > After all, matrices (of given shape) are a type of vector (i.e., can be added > and scaled.) > So if by "matrix" you mean "2d array" and by "stack of vectors" you > effectively mean "2d array", > this sounds like a use for np.dot (possibly after a transpose). > However I am going to guess that here by "vector" you actually mean a matrix > (i.e., a 2d array) with only one row or only one column, so a "stack" of them > is actually 3d. Perhaps the needless dimension is then the real problem > and can either not be produced or can be squeezed away.. Stack of matrices in this context is a an ndarray in which the last two dimensions are interpreted as columns and rows of matrices (in that order), stack of vectors as an ndarray in which the last dimension is interpreted as column vectors. (Well, axes in all these functions can be chosen arbitrarily, but those are the defaults.) So a simple example for matvec would be a rotation matrix that I'd like to apply to a large number of vectors (say to points in space); this is currently not easy. Complex vectors might be Fourier components. (Note that I was rather sloppy in calling it multiplication rather than using the term vector-matrix product, etc.; it is definitely not element-wise!). The vector-matrix product comes up a bit less, but as mentioned by Evgeni in physics there is, e.g., the bra-ket stuff with Hamiltonians (<\psi | \hat H | \psi>), and in linear least squares one often gets terms which for real data are written as Xᵀ Z, but for complex would be Xᴴ Z [1] Hope this clarifies things! -- Marten [1] https://en.wikipedia.org/wiki/Linear_least_squares ___ NumPy-Discussion mailing list -- numpy-discussion@python.org To unsubscribe send an email to numpy-discussion-le...@python.org https://mail.python.org/mailman3/lists/numpy-discussion.python.org/ Member address: arch...@mail-archive.com
[Numpy-discussion] Re: New matvec and vecmat functions
> FWIW, +1 for matvec & vecmat to complement matmat (erm, matmul). Having a > binop where one argument is a matrix and the other is a > stack/batch of vectors is indeed awkward otherwise, and a dedicated function > to clearly distinguish "two matrices" from "a matrix and a > batch of vectors" sounds great from a user perspective. > > As to vecmat doing hermitian conjugation of the vector argument --- I'd be in > favor (because <\psi | \hat H | \psi>) but this is a weak > preference. Indeed, what wikipedia calls the "physics convention" [1] of both the vector dot product = x† y and vector matrix product x† A († = transpose-conjugate). -- Marten [1] https://en.wikipedia.org/wiki/Sesquilinear_form ___ NumPy-Discussion mailing list -- numpy-discussion@python.org To unsubscribe send an email to numpy-discussion-le...@python.org https://mail.python.org/mailman3/lists/numpy-discussion.python.org/ Member address: arch...@mail-archive.com
[Numpy-discussion] Re: New matvec and vecmat functions
On Wed, Jan 24, 2024 at 2:27 PM Marten van Kerkwijk wrote: > > Why do these belong in NumPy? What is the broad field of application of > these functions? And, > > does a more general concept underpin them? > > Multiplication of a matrix with a vector is about as common as matrix > with matrix or vector with vector, and not currently easy to do for > stacks of vectors, so I think the case for matvec is similarly strong as > that for matmul and vecdot. > > Arguably, vecmat is slightly less common, though completes the quad. > > -- Marten > > Could you please offer some code or math notation to help communicate this? I am forced to guess at the need. The words "matrix" and "vector" are ambiguous. After all, matrices (of given shape) are a type of vector (i.e., can be added and scaled.) So if by "matrix" you mean "2d array" and by "stack of vectors" you effectively mean "2d array", this sounds like a use for np.dot (possibly after a transpose). However I am going to guess that here by "vector" you actually mean a matrix (i.e., a 2d array) with only one row or only one column, so a "stack" of them is actually 3d. Perhaps the needless dimension is then the real problem and can either not be produced or can be squeezed away.. Thanks, Alan Isaac ___ NumPy-Discussion mailing list -- numpy-discussion@python.org To unsubscribe send an email to numpy-discussion-le...@python.org https://mail.python.org/mailman3/lists/numpy-discussion.python.org/ Member address: arch...@mail-archive.com
[Numpy-discussion] Re: New matvec and vecmat functions
On Wed, 24 Jan 2024 at 19:29, Marten van Kerkwijk wrote: > > > Why do these belong in NumPy? What is the broad field of application of > > these functions? And, > > does a more general concept underpin them? > > Multiplication of a matrix with a vector is about as common as matrix > with matrix or vector with vector, and not currently easy to do for > stacks of vectors, so I think the case for matvec is similarly strong as > that for matmul and vecdot. If either argument is conjugated then I would not describe this simply as "multiplication". A different terminology would be better (and perhaps different function names). -- Oscar ___ NumPy-Discussion mailing list -- numpy-discussion@python.org To unsubscribe send an email to numpy-discussion-le...@python.org https://mail.python.org/mailman3/lists/numpy-discussion.python.org/ Member address: arch...@mail-archive.com
[Numpy-discussion] Re: New matvec and vecmat functions
FWIW, +1 for matvec & vecmat to complement matmat (erm, matmul). Having a binop where one argument is a matrix and the other is a stack/batch of vectors is indeed awkward otherwise, and a dedicated function to clearly distinguish "two matrices" from "a matrix and a batch of vectors" sounds great from a user perspective. As to vecmat doing hermitian conjugation of the vector argument --- I'd be in favor (because <\psi | \hat H | \psi>) but this is a weak preference. ср, 24 янв. 2024 г., 21:28 Marten van Kerkwijk : > > Why do these belong in NumPy? What is the broad field of application of > these functions? And, > > does a more general concept underpin them? > > Multiplication of a matrix with a vector is about as common as matrix > with matrix or vector with vector, and not currently easy to do for > stacks of vectors, so I think the case for matvec is similarly strong as > that for matmul and vecdot. > > Arguably, vecmat is slightly less common, though completes the quad. > > -- Marten > > > ___ > NumPy-Discussion mailing list -- numpy-discussion@python.org > To unsubscribe send an email to numpy-discussion-le...@python.org > https://mail.python.org/mailman3/lists/numpy-discussion.python.org/ > Member address: evgeny.burovs...@gmail.com > ___ NumPy-Discussion mailing list -- numpy-discussion@python.org To unsubscribe send an email to numpy-discussion-le...@python.org https://mail.python.org/mailman3/lists/numpy-discussion.python.org/ Member address: arch...@mail-archive.com
[Numpy-discussion] Re: New matvec and vecmat functions
> Why do these belong in NumPy? What is the broad field of application of these > functions? And, > does a more general concept underpin them? Multiplication of a matrix with a vector is about as common as matrix with matrix or vector with vector, and not currently easy to do for stacks of vectors, so I think the case for matvec is similarly strong as that for matmul and vecdot. Arguably, vecmat is slightly less common, though completes the quad. -- Marten ___ NumPy-Discussion mailing list -- numpy-discussion@python.org To unsubscribe send an email to numpy-discussion-le...@python.org https://mail.python.org/mailman3/lists/numpy-discussion.python.org/ Member address: arch...@mail-archive.com
[Numpy-discussion] Re: New matvec and vecmat functions
Why do these belong in NumPy? What is the broad field of application of these functions? And, does a more general concept underpin them? Thanks, Alan Isaac On Tue, Jan 23, 2024 at 5:17 PM Marten van Kerkwijk wrote: > Hi All, > > I have a PR [1] that adds `np.matvec` and `np.vecmat` gufuncs for > matrix-vector and vector-matrix calculations, to add to plain > matrix-matrix multiplication with `np.matmul` and the inner vector > product with `np.vecdot`. They call BLAS where possible for speed. > I'd like to hear whether these are good additions. > > I also note that for complex numbers, `vecmat` is defined as `x†A`, > i.e., the complex conjugate of the vector is taken. This seems to be the > standard and is what we used for `vecdot` too (`x†x`). However, it is > *not* what `matmul` does for vector-matrix or indeed vector-vector > products (remember that those are possible only if the vector is > one-dimensional, i.e., not with a stack of vectors). I think this is a > bug in matmul, which I'm happy to fix. But I'm posting here in part to > get feedback on that. > > Thanks! > > Marten > > [1] https://github.com/numpy/numpy/pull/25675 > > p.s. Separately, with these functions available, in principle these > could be used in `__matmul__` (and thus for `@`) and the specializations > in `np.matmul` removed. But that can be a separate PR (if it is wanted > at all). > ___ > NumPy-Discussion mailing list -- numpy-discussion@python.org > To unsubscribe send an email to numpy-discussion-le...@python.org > https://mail.python.org/mailman3/lists/numpy-discussion.python.org/ > Member address: alan.is...@gmail.com > ___ NumPy-Discussion mailing list -- numpy-discussion@python.org To unsubscribe send an email to numpy-discussion-le...@python.org https://mail.python.org/mailman3/lists/numpy-discussion.python.org/ Member address: arch...@mail-archive.com
[Numpy-discussion] Re: Proposal to accept NEP 55: Add a UTF-8 variable-width string DType to NumPy
On Wed, Jan 24, 2024 at 10:43 AM Sebastian Berg wrote: > On Mon, 2024-01-22 at 17:08 -0700, Nathan wrote: > > Hi all, > > > > I propose we accept NEP 55 and merge PR #25347 implementing the NEP > > in time > > for the NumPy 2.0 RC: > > > I really like this work and I think it is a big improvement! At this > point we probably have to expect some things to be still buggy, but > that is also a reason to get it in (testing is hard if it isn't shipped > first-class unfortunately). > +1 to this. It's seen a ton of hard and careful work for about a year now, and seems close to as ready as it's going to get pre-merging. So +1 to accepting the NEP now and hitting the green button on your main PR. Cheers, Ralf Nathan summarized the things I might have brought up very well. The > support of missing values is the one thing that to me may end up a bit > more in flux. > But I am happy to hope that this is in a way that pandas will not be > affected and, honestly, without deep integration testing we won't make > progress in figuring out whether there is some change needed or not. > > Thanks for the great work! > > - Sebastian > > > > > > https://numpy.org/neps/nep-0055-string_dtype.html > > https://github.com/numpy/numpy/pull/25347 > > > > The most controversial aspect of the NEP was support for missing > > strings > > via a user-supplied sentinel object. In the previous discussion on > > the > > mailing list, Warren Weckesser argued for shipping a missing data > > sentinel > > with NumPy for use with the DType, while in code review and the PR > > for the > > NEP, Sebestian expressed concern about the additional complexity of > > including missing data support at all. > > > > I found that supporting missing data is key to efficiently supporting > > the > > new DType in Pandas. I think that argues that we need some level of > > missing > > data support to fully replace object string arrays. I believe the > > compromise proposal in the NEP is sufficient for downstream libraries > > while > > limiting additional complexity elsewhere in NumPy. > > > > Concerns raised in previous discussions about concretely specifying > > the C > > API to be made public, preventing use-after-free errors in a > > multithreaded > > context, and uncertainty around the arena allocator implementation > > have > > been resolved in the latest version of the NEP and the open PR. > > Additionally, due to some excellent and timely work by Lysandros > > Nikolaou, > > we now have a number of string ufuncs in NumPy and a straightforward > > plan > > to add more. Loops have been implemented for all the ufuncs added in > > the > > NumPy 2.0 dev cycle so far. > > > > I would like to see us ship the DType in NumPy 2.0. This will allow > > us to > > advertise a major new feature, will spur efforts to support new > > DTypes in > > downstream libraries, and will allow us to get feedback from the > > community > > that would be difficult to obtain without releasing the code into the > > wild. > > Additionally, I am funded via a NASA ROSES grant for work related to > > this > > effort until the end of 2024, so including the DType in NumPy 2.0 > > will more > > efficiently use my funded time to fix issues. > > > > If there are no substantive objections to this email, then the NEP > > will be > > considered accepted; see NEP 0 for more details: > > https://numpy.org/neps/nep-.html > > ___ > > NumPy-Discussion mailing list -- numpy-discussion@python.org > > To unsubscribe send an email to numpy-discussion-le...@python.org > > https://mail.python.org/mailman3/lists/numpy-discussion.python.org/ > > Member address: sebast...@sipsolutions.net > > > ___ > NumPy-Discussion mailing list -- numpy-discussion@python.org > To unsubscribe send an email to numpy-discussion-le...@python.org > https://mail.python.org/mailman3/lists/numpy-discussion.python.org/ > Member address: ralf.gomm...@googlemail.com > ___ NumPy-Discussion mailing list -- numpy-discussion@python.org To unsubscribe send an email to numpy-discussion-le...@python.org https://mail.python.org/mailman3/lists/numpy-discussion.python.org/ Member address: arch...@mail-archive.com
[Numpy-discussion] Re: Enhancement: Chebyshev power using DCT
On Wed, Jan 24, 2024 at 6:29 AM Fabio Matti wrote: > Hi, > > In the `numpy.polynomial.chebyshev` module, the function for raising a > Chebyshev polynomial to a power, `chebpow` [1], is essentially implemented > in the following way: > > {{{#!highlight python > def chebpow(c, pow): > """Raise a Chebyshev series to a power.""" > zs = _cseries_to_zseries(c) > prd = zs > for i in range(2, pow + 1): > prd = np.convolve(prd, zs) > return _zseries_to_cseries(prd) > }}} > > For large coefficient arrays `c` and big exponents `pow`, this procedure > is not efficient. In fact, the complexity of this function is > `O(pow*len(c)^2)`, since the numpy convolution does not make use of a Fast > Fourier Transform (FFT). > > It is known that Chebyshev polynomials can be multiplied with Discrete > Cosine Transforms (DCT) [2]. What results is the following > algorithm`O(pow*len(c)*log(pow*len(c)))` algorithm for raising a Chebyshev > polynomial with coefficients `c` to an integer power: > > {{{#!highlight python > def chebpow_dct(c, pow): > """Raise a Chebyshev series to a power.""" > pad_length = (pow - 1) * (len(c) - 1) > c = np.pad(c, (0, pad_length)) > c[1:-1] /= 2 > c_pow = idct(dct(c) ** pow) > c_pow[1:-1] *= 2 > return c_pow > }}} > > The only issue I am having is that as far as I know, `numpy` (unlike > `scipy`) does not have a specialized implementation for the DCT. So the > only way of getting the code to work is "emulating" a DCT with two calls to > `numpy.fft.rfft`, which is slightly slower than using the `scipy.fft.dct`. > > I have created a Google colab notebook which compares the error and > runtime of the different implementations (current implementation, > implementation using `scipy.fft.dct`, and pure numpy implementation) [3]. > Especially for larger degree polynomials and higher powers this enhancement > would make a huge difference in terms of runtime. > > Similarly, `chebmul` and `chebinterpolate` can also be implemented more > efficiently by using a DCT. > > Do you think this enhancement is worth pursuing, and should I create a > pull-request for it? > > Best, > > Fabio > > [1] > https://github.com/numpy/numpy/blob/main/numpy/polynomial/chebyshev.py#L817 > [2] https://www.sciencedirect.com/science/article/pii/0024379595006966 > [3] > https://colab.research.google.com/drive/1JtDDeWC1CEQHDidZ9f5_Ma_ifoBv4Tuz?usp=sharing This is a tricky sort of decision. The various polynomial functions were designed so that they could be used with different types, Fractions, for instance, and for degrees less than about 50 max. They are sort of a cross between teaching tools and quick prototyping. I agree that for fast production code and big degrees, DCT is the way to go for Chebyshev. There are even some packages out there designed on that basis. I wouldn't complain if someone produced a package with restricted types that was more efficient than what NumPy has, indeed, types are already restricted for curve fitting. NumPy is trending towards specialization these days, I suspect a separate polynomial package would be a better place for such improvements. Chuck ___ NumPy-Discussion mailing list -- numpy-discussion@python.org To unsubscribe send an email to numpy-discussion-le...@python.org https://mail.python.org/mailman3/lists/numpy-discussion.python.org/ Member address: arch...@mail-archive.com
[Numpy-discussion] Enhancement: Chebyshev power using DCT
Hi, In the `numpy.polynomial.chebyshev` module, the function for raising a Chebyshev polynomial to a power, `chebpow` [1], is essentially implemented in the following way: {{{#!highlight python def chebpow(c, pow): """Raise a Chebyshev series to a power.""" zs = _cseries_to_zseries(c) prd = zs for i in range(2, pow + 1): prd = np.convolve(prd, zs) return _zseries_to_cseries(prd) }}} For large coefficient arrays `c` and big exponents `pow`, this procedure is not efficient. In fact, the complexity of this function is `O(pow*len(c)^2)`, since the numpy convolution does not make use of a Fast Fourier Transform (FFT). It is known that Chebyshev polynomials can be multiplied with Discrete Cosine Transforms (DCT) [2]. What results is the following algorithm`O(pow*len(c)*log(pow*len(c)))` algorithm for raising a Chebyshev polynomial with coefficients `c` to an integer power: {{{#!highlight python def chebpow_dct(c, pow): """Raise a Chebyshev series to a power.""" pad_length = (pow - 1) * (len(c) - 1) c = np.pad(c, (0, pad_length)) c[1:-1] /= 2 c_pow = idct(dct(c) ** pow) c_pow[1:-1] *= 2 return c_pow }}} The only issue I am having is that as far as I know, `numpy` (unlike `scipy`) does not have a specialized implementation for the DCT. So the only way of getting the code to work is "emulating" a DCT with two calls to `numpy.fft.rfft`, which is slightly slower than using the `scipy.fft.dct`. I have created a Google colab notebook which compares the error and runtime of the different implementations (current implementation, implementation using `scipy.fft.dct`, and pure numpy implementation) [3]. Especially for larger degree polynomials and higher powers this enhancement would make a huge difference in terms of runtime. Similarly, `chebmul` and `chebinterpolate` can also be implemented more efficiently by using a DCT. Do you think this enhancement is worth pursuing, and should I create a pull-request for it? Best, Fabio [1] https://github.com/numpy/numpy/blob/main/numpy/polynomial/chebyshev.py#L817 [2] https://www.sciencedirect.com/science/article/pii/0024379595006966 [3] https://colab.research.google.com/drive/1JtDDeWC1CEQHDidZ9f5_Ma_ifoBv4Tuz?usp=sharing ___ NumPy-Discussion mailing list -- numpy-discussion@python.org To unsubscribe send an email to numpy-discussion-le...@python.org https://mail.python.org/mailman3/lists/numpy-discussion.python.org/ Member address: arch...@mail-archive.com
[Numpy-discussion] Re: Proposal to accept NEP 55: Add a UTF-8 variable-width string DType to NumPy
On Mon, 2024-01-22 at 17:08 -0700, Nathan wrote: > Hi all, > > I propose we accept NEP 55 and merge PR #25347 implementing the NEP > in time > for the NumPy 2.0 RC: I really like this work and I think it is a big improvement! At this point we probably have to expect some things to be still buggy, but that is also a reason to get it in (testing is hard if it isn't shipped first-class unfortunately). Nathan summarized the things I might have brought up very well. The support of missing values is the one thing that to me may end up a bit more in flux. But I am happy to hope that this is in a way that pandas will not be affected and, honestly, without deep integration testing we won't make progress in figuring out whether there is some change needed or not. Thanks for the great work! - Sebastian > > https://numpy.org/neps/nep-0055-string_dtype.html > https://github.com/numpy/numpy/pull/25347 > > The most controversial aspect of the NEP was support for missing > strings > via a user-supplied sentinel object. In the previous discussion on > the > mailing list, Warren Weckesser argued for shipping a missing data > sentinel > with NumPy for use with the DType, while in code review and the PR > for the > NEP, Sebestian expressed concern about the additional complexity of > including missing data support at all. > > I found that supporting missing data is key to efficiently supporting > the > new DType in Pandas. I think that argues that we need some level of > missing > data support to fully replace object string arrays. I believe the > compromise proposal in the NEP is sufficient for downstream libraries > while > limiting additional complexity elsewhere in NumPy. > > Concerns raised in previous discussions about concretely specifying > the C > API to be made public, preventing use-after-free errors in a > multithreaded > context, and uncertainty around the arena allocator implementation > have > been resolved in the latest version of the NEP and the open PR. > Additionally, due to some excellent and timely work by Lysandros > Nikolaou, > we now have a number of string ufuncs in NumPy and a straightforward > plan > to add more. Loops have been implemented for all the ufuncs added in > the > NumPy 2.0 dev cycle so far. > > I would like to see us ship the DType in NumPy 2.0. This will allow > us to > advertise a major new feature, will spur efforts to support new > DTypes in > downstream libraries, and will allow us to get feedback from the > community > that would be difficult to obtain without releasing the code into the > wild. > Additionally, I am funded via a NASA ROSES grant for work related to > this > effort until the end of 2024, so including the DType in NumPy 2.0 > will more > efficiently use my funded time to fix issues. > > If there are no substantive objections to this email, then the NEP > will be > considered accepted; see NEP 0 for more details: > https://numpy.org/neps/nep-.html > ___ > NumPy-Discussion mailing list -- numpy-discussion@python.org > To unsubscribe send an email to numpy-discussion-le...@python.org > https://mail.python.org/mailman3/lists/numpy-discussion.python.org/ > Member address: sebast...@sipsolutions.net ___ NumPy-Discussion mailing list -- numpy-discussion@python.org To unsubscribe send an email to numpy-discussion-le...@python.org https://mail.python.org/mailman3/lists/numpy-discussion.python.org/ Member address: arch...@mail-archive.com