[Numpy-discussion] Re: New matvec and vecmat functions

2024-01-24 Thread Marten van Kerkwijk
> Could you please offer some code or math notation to help communicate this?
> I am forced to guess at the need.
>
> The words "matrix" and "vector" are ambiguous.
> After all, matrices (of given shape) are a type of vector (i.e., can be added 
> and scaled.)
> So if by "matrix" you mean "2d array" and by "stack of vectors" you 
> effectively mean "2d array",
> this sounds like a use for np.dot (possibly after a transpose).
> However I am going to guess that here by "vector" you actually mean a matrix
> (i.e., a 2d array) with only one row or only one column, so a "stack" of them
> is actually 3d. Perhaps the needless dimension is then the real problem
> and can either not be produced or can be squeezed away..

Stack of matrices in this context is a an ndarray in which the last two
dimensions are interpreted as columns and rows of matrices (in that
order), stack of vectors as an ndarray in which the last dimension is
interpreted as column vectors.  (Well, axes in all these functions can
be chosen arbitrarily, but those are the defaults.)

So a simple example for matvec would be a rotation matrix that I'd like
to apply to a large number of vectors (say to points in space); this is
currently not easy.  Complex vectors might be Fourier components.  (Note
that I was rather sloppy in calling it multiplication rather than using
the term vector-matrix product, etc.; it is definitely not element-wise!).

The vector-matrix product comes up a bit less, but as mentioned by
Evgeni in physics there is, e.g., the bra-ket stuff with Hamiltonians
(<\psi | \hat H | \psi>), and in linear least squares one often gets
terms which for real data are written as Xᵀ Z, but for complex would be
Xᴴ Z [1]

Hope this clarifies things!

-- Marten

[1] https://en.wikipedia.org/wiki/Linear_least_squares
___
NumPy-Discussion mailing list -- numpy-discussion@python.org
To unsubscribe send an email to numpy-discussion-le...@python.org
https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
Member address: arch...@mail-archive.com


[Numpy-discussion] Re: New matvec and vecmat functions

2024-01-24 Thread Marten van Kerkwijk
> FWIW, +1 for matvec & vecmat to complement matmat (erm, matmul). Having a 
> binop where one argument is a matrix and the other is a
> stack/batch of vectors is indeed awkward otherwise, and a dedicated function 
> to clearly distinguish "two matrices" from "a matrix and a
> batch of vectors" sounds great from a user perspective.
>
> As to vecmat doing hermitian conjugation of the vector argument --- I'd be in 
> favor (because <\psi | \hat H | \psi>) but this is a weak
> preference.

Indeed, what wikipedia calls the "physics convention" [1] of both the
vector dot product  = x† y and vector matrix product x† A († =
transpose-conjugate).

-- Marten

[1] https://en.wikipedia.org/wiki/Sesquilinear_form
___
NumPy-Discussion mailing list -- numpy-discussion@python.org
To unsubscribe send an email to numpy-discussion-le...@python.org
https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
Member address: arch...@mail-archive.com


[Numpy-discussion] Re: New matvec and vecmat functions

2024-01-24 Thread Alan
On Wed, Jan 24, 2024 at 2:27 PM Marten van Kerkwijk 
wrote:

> > Why do these belong in NumPy? What is the broad field of application of
> these functions? And,
> > does a more general concept underpin them?
>
> Multiplication of a matrix with a vector is about as common as matrix
> with matrix or vector with vector, and not currently easy to do for
> stacks of vectors, so I think the case for matvec is similarly strong as
> that for matmul and vecdot.
>
> Arguably, vecmat is slightly less common, though completes the quad.
>
> -- Marten
>
>



Could you please offer some code or math notation to help communicate this?
I am forced to guess at the need.

The words "matrix" and "vector" are ambiguous.
After all, matrices (of given shape) are a type of vector (i.e., can be
added and scaled.)
So if by "matrix" you mean "2d array" and by "stack of vectors" you
effectively mean "2d array",
this sounds like a use for np.dot (possibly after a transpose).
However I am going to guess that here by "vector" you actually mean a matrix
(i.e., a 2d array) with only one row or only one column, so a "stack" of
them
is actually 3d. Perhaps the needless dimension is then the real problem
and can either not be produced or can be squeezed away..

Thanks, Alan Isaac
___
NumPy-Discussion mailing list -- numpy-discussion@python.org
To unsubscribe send an email to numpy-discussion-le...@python.org
https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
Member address: arch...@mail-archive.com


[Numpy-discussion] Re: New matvec and vecmat functions

2024-01-24 Thread Oscar Benjamin
On Wed, 24 Jan 2024 at 19:29, Marten van Kerkwijk
 wrote:
>
> > Why do these belong in NumPy? What is the broad field of application of 
> > these functions? And,
> > does a more general concept underpin them?
>
> Multiplication of a matrix with a vector is about as common as matrix
> with matrix or vector with vector, and not currently easy to do for
> stacks of vectors, so I think the case for matvec is similarly strong as
> that for matmul and vecdot.

If either argument is conjugated then I would not describe this simply
as "multiplication". A different terminology would be better (and
perhaps different function names).

--
Oscar
___
NumPy-Discussion mailing list -- numpy-discussion@python.org
To unsubscribe send an email to numpy-discussion-le...@python.org
https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
Member address: arch...@mail-archive.com


[Numpy-discussion] Re: New matvec and vecmat functions

2024-01-24 Thread Evgeni Burovski
FWIW, +1 for matvec & vecmat to complement matmat (erm, matmul). Having a
binop where one argument is a matrix and the other is a stack/batch of
vectors is indeed awkward otherwise, and a dedicated function to clearly
distinguish "two matrices" from "a matrix and a batch of vectors" sounds
great from a user perspective.

As to vecmat doing hermitian conjugation of the vector argument --- I'd be
in favor (because <\psi | \hat H | \psi>) but this is a weak preference.

ср, 24 янв. 2024 г., 21:28 Marten van Kerkwijk :

> > Why do these belong in NumPy? What is the broad field of application of
> these functions? And,
> > does a more general concept underpin them?
>
> Multiplication of a matrix with a vector is about as common as matrix
> with matrix or vector with vector, and not currently easy to do for
> stacks of vectors, so I think the case for matvec is similarly strong as
> that for matmul and vecdot.
>
> Arguably, vecmat is slightly less common, though completes the quad.
>
> -- Marten
>
>
> ___
> NumPy-Discussion mailing list -- numpy-discussion@python.org
> To unsubscribe send an email to numpy-discussion-le...@python.org
> https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
> Member address: evgeny.burovs...@gmail.com
>
___
NumPy-Discussion mailing list -- numpy-discussion@python.org
To unsubscribe send an email to numpy-discussion-le...@python.org
https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
Member address: arch...@mail-archive.com


[Numpy-discussion] Re: New matvec and vecmat functions

2024-01-24 Thread Marten van Kerkwijk
> Why do these belong in NumPy? What is the broad field of application of these 
> functions? And,
> does a more general concept underpin them?

Multiplication of a matrix with a vector is about as common as matrix
with matrix or vector with vector, and not currently easy to do for
stacks of vectors, so I think the case for matvec is similarly strong as
that for matmul and vecdot.

Arguably, vecmat is slightly less common, though completes the quad.

-- Marten


___
NumPy-Discussion mailing list -- numpy-discussion@python.org
To unsubscribe send an email to numpy-discussion-le...@python.org
https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
Member address: arch...@mail-archive.com


[Numpy-discussion] Re: New matvec and vecmat functions

2024-01-24 Thread Alan
Why do these belong in NumPy? What is the broad field of application of
these functions? And, does a more general concept underpin them?
Thanks, Alan Isaac

On Tue, Jan 23, 2024 at 5:17 PM Marten van Kerkwijk 
wrote:

> Hi All,
>
> I have a PR [1] that adds `np.matvec` and `np.vecmat` gufuncs for
> matrix-vector and vector-matrix calculations, to add to plain
> matrix-matrix multiplication with `np.matmul` and the inner vector
> product with `np.vecdot`.  They call BLAS where possible for speed.
> I'd like to hear whether these are good additions.
>
> I also note that for complex numbers, `vecmat` is defined as `x†A`,
> i.e., the complex conjugate of the vector is taken. This seems to be the
> standard and is what we used for `vecdot` too (`x†x`). However, it is
> *not* what `matmul` does for vector-matrix or indeed vector-vector
> products (remember that those are possible only if the vector is
> one-dimensional, i.e., not with a stack of vectors). I think this is a
> bug in matmul, which I'm happy to fix. But I'm posting here in part to
> get feedback on that.
>
> Thanks!
>
> Marten
>
> [1] https://github.com/numpy/numpy/pull/25675
>
> p.s. Separately, with these functions available, in principle these
> could be used in `__matmul__` (and thus for `@`) and the specializations
> in `np.matmul` removed. But that can be a separate PR (if it is wanted
> at all).
> ___
> NumPy-Discussion mailing list -- numpy-discussion@python.org
> To unsubscribe send an email to numpy-discussion-le...@python.org
> https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
> Member address: alan.is...@gmail.com
>
___
NumPy-Discussion mailing list -- numpy-discussion@python.org
To unsubscribe send an email to numpy-discussion-le...@python.org
https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
Member address: arch...@mail-archive.com


[Numpy-discussion] Re: Proposal to accept NEP 55: Add a UTF-8 variable-width string DType to NumPy

2024-01-24 Thread Ralf Gommers
On Wed, Jan 24, 2024 at 10:43 AM Sebastian Berg 
wrote:

> On Mon, 2024-01-22 at 17:08 -0700, Nathan wrote:
> > Hi all,
> >
> > I propose we accept NEP 55 and merge PR #25347 implementing the NEP
> > in time
> > for the NumPy 2.0 RC:
>
>
> I really like this work and I think it is a big improvement!  At this
> point we probably have to expect some things to be still buggy, but
> that is also a reason to get it in (testing is hard if it isn't shipped
> first-class unfortunately).
>

+1 to this. It's seen a ton of hard and careful work for about a year now,
and seems close to as ready as it's going to get pre-merging. So +1 to
accepting the NEP now and hitting the green button on your main PR.

Cheers,
Ralf


Nathan summarized the things I might have brought up very well.  The
> support of missing values is the one thing that to me may end up a bit
> more in flux.
> But I am happy to hope that this is in a way that pandas will not be
> affected and, honestly, without deep integration testing we won't make
> progress in figuring out whether there is some change needed or not.
>
> Thanks for the great work!
>
> - Sebastian
>
>
> >
> > https://numpy.org/neps/nep-0055-string_dtype.html
> > https://github.com/numpy/numpy/pull/25347
> >
> > The most controversial aspect of the NEP was support for missing
> > strings
> > via a user-supplied sentinel object. In the previous discussion on
> > the
> > mailing list, Warren Weckesser argued for shipping a missing data
> > sentinel
> > with NumPy for use with the DType, while in code review and the PR
> > for the
> > NEP, Sebestian expressed concern about the additional complexity of
> > including missing data support at all.
> >
> > I found that supporting missing data is key to efficiently supporting
> > the
> > new DType in Pandas. I think that argues that we need some level of
> > missing
> > data support to fully replace object string arrays. I believe the
> > compromise proposal in the NEP is sufficient for downstream libraries
> > while
> > limiting additional complexity elsewhere in NumPy.
> >
> > Concerns raised in previous discussions about concretely specifying
> > the C
> > API to be made public, preventing use-after-free errors in a
> > multithreaded
> > context, and uncertainty around the arena allocator implementation
> > have
> > been resolved in the latest version of the NEP and the open PR.
> > Additionally, due to some excellent and timely work by Lysandros
> > Nikolaou,
> > we now have a number of string ufuncs in NumPy and a straightforward
> > plan
> > to add more. Loops have been implemented for all the ufuncs added in
> > the
> > NumPy 2.0 dev cycle so far.
> >
> > I would like to see us ship the DType in NumPy 2.0. This will allow
> > us to
> > advertise a major new feature, will spur efforts to support new
> > DTypes in
> > downstream libraries, and will allow us to get feedback from the
> > community
> > that would be difficult to obtain without releasing the code into the
> > wild.
> > Additionally, I am funded via a NASA ROSES grant for work related to
> > this
> > effort until the end of 2024, so including the DType in NumPy 2.0
> > will more
> > efficiently use my funded time to fix issues.
> >
> > If there are no substantive objections to this email, then the NEP
> > will be
> > considered accepted; see NEP 0 for more details:
> > https://numpy.org/neps/nep-.html
> > ___
> > NumPy-Discussion mailing list -- numpy-discussion@python.org
> > To unsubscribe send an email to numpy-discussion-le...@python.org
> > https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
> > Member address: sebast...@sipsolutions.net
>
>
> ___
> NumPy-Discussion mailing list -- numpy-discussion@python.org
> To unsubscribe send an email to numpy-discussion-le...@python.org
> https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
> Member address: ralf.gomm...@googlemail.com
>
___
NumPy-Discussion mailing list -- numpy-discussion@python.org
To unsubscribe send an email to numpy-discussion-le...@python.org
https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
Member address: arch...@mail-archive.com


[Numpy-discussion] Re: Enhancement: Chebyshev power using DCT

2024-01-24 Thread Charles R Harris
On Wed, Jan 24, 2024 at 6:29 AM Fabio Matti 
wrote:

> Hi,
>
> In the `numpy.polynomial.chebyshev` module, the function for raising a
> Chebyshev polynomial to a power, `chebpow` [1], is essentially implemented
> in the following way:
>
> {{{#!highlight python
> def chebpow(c, pow):
> """Raise a Chebyshev series to a power."""
> zs = _cseries_to_zseries(c)
> prd = zs
> for i in range(2, pow + 1):
> prd = np.convolve(prd, zs)
> return _zseries_to_cseries(prd)
> }}}
>
> For large coefficient arrays `c` and big exponents `pow`, this procedure
> is not efficient. In fact, the complexity of this function is
> `O(pow*len(c)^2)`, since the numpy convolution does not make use of a Fast
> Fourier Transform (FFT).
>
> It is known that Chebyshev polynomials can be multiplied with Discrete
> Cosine Transforms (DCT) [2]. What results is the following
> algorithm`O(pow*len(c)*log(pow*len(c)))` algorithm for raising a Chebyshev
> polynomial with coefficients `c` to an integer power:
>
> {{{#!highlight python
> def chebpow_dct(c, pow):
> """Raise a Chebyshev series to a power."""
> pad_length = (pow - 1) * (len(c) - 1)
> c = np.pad(c, (0, pad_length))
> c[1:-1] /= 2
> c_pow = idct(dct(c) ** pow)
> c_pow[1:-1] *= 2
> return c_pow
> }}}
>
> The only issue I am having is that as far as I know, `numpy` (unlike
> `scipy`) does not have a specialized implementation for the DCT. So the
> only way of getting the code to work is "emulating" a DCT with two calls to
> `numpy.fft.rfft`, which is slightly slower than  using the `scipy.fft.dct`.
>
> I have created a Google colab notebook which compares the error and
> runtime of the different implementations (current implementation,
> implementation using `scipy.fft.dct`, and pure numpy implementation) [3].
> Especially for larger degree polynomials and higher powers this enhancement
> would make a huge difference in terms of runtime.
>
> Similarly, `chebmul` and `chebinterpolate` can also be implemented more
> efficiently by using a DCT.
>
> Do you think this enhancement is worth pursuing, and should I create a
> pull-request for it?
>
> Best,
>
> Fabio
>
> [1]
> https://github.com/numpy/numpy/blob/main/numpy/polynomial/chebyshev.py#L817
> [2] https://www.sciencedirect.com/science/article/pii/0024379595006966
> [3]
> https://colab.research.google.com/drive/1JtDDeWC1CEQHDidZ9f5_Ma_ifoBv4Tuz?usp=sharing


This is a tricky sort of decision. The various polynomial functions were
designed so that they could be used with different types, Fractions, for
instance, and for degrees less than about 50 max. They are sort of a cross
between teaching tools and quick prototyping. I agree that for fast
production code and big degrees, DCT is the way to go for Chebyshev. There
are even some packages out there designed on that basis. I wouldn't
complain if someone produced a package with restricted types that was more
efficient than what NumPy has, indeed, types are already restricted for
curve fitting. NumPy is trending towards specialization these days, I
suspect a separate polynomial package would be a better place for such
improvements.

Chuck
___
NumPy-Discussion mailing list -- numpy-discussion@python.org
To unsubscribe send an email to numpy-discussion-le...@python.org
https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
Member address: arch...@mail-archive.com


[Numpy-discussion] Enhancement: Chebyshev power using DCT

2024-01-24 Thread Fabio Matti
Hi,

In the `numpy.polynomial.chebyshev` module, the function for raising a 
Chebyshev polynomial to a power, `chebpow` [1], is essentially implemented in 
the following way:

{{{#!highlight python
def chebpow(c, pow):
"""Raise a Chebyshev series to a power."""
zs = _cseries_to_zseries(c)
prd = zs
for i in range(2, pow + 1):
prd = np.convolve(prd, zs)
return _zseries_to_cseries(prd)
}}}

For large coefficient arrays `c` and big exponents `pow`, this procedure is not 
efficient. In fact, the complexity of this function is `O(pow*len(c)^2)`, since 
the numpy convolution does not make use of a Fast Fourier Transform (FFT).

It is known that Chebyshev polynomials can be multiplied with Discrete Cosine 
Transforms (DCT) [2]. What results is the following 
algorithm`O(pow*len(c)*log(pow*len(c)))` algorithm for raising a Chebyshev 
polynomial with coefficients `c` to an integer power:

{{{#!highlight python
def chebpow_dct(c, pow):
"""Raise a Chebyshev series to a power."""
pad_length = (pow - 1) * (len(c) - 1)
c = np.pad(c, (0, pad_length))
c[1:-1] /= 2
c_pow = idct(dct(c) ** pow)
c_pow[1:-1] *= 2
return c_pow
}}}

The only issue I am having is that as far as I know, `numpy` (unlike `scipy`) 
does not have a specialized implementation for the DCT. So the only way of 
getting the code to work is "emulating" a DCT with two calls to 
`numpy.fft.rfft`, which is slightly slower than  using the `scipy.fft.dct`.

I have created a Google colab notebook which compares the error and runtime of 
the different implementations (current implementation, implementation using 
`scipy.fft.dct`, and pure numpy implementation) [3]. Especially for larger 
degree polynomials and higher powers this enhancement would make a huge 
difference in terms of runtime.

Similarly, `chebmul` and `chebinterpolate` can also be implemented more 
efficiently by using a DCT. 

Do you think this enhancement is worth pursuing, and should I create a 
pull-request for it?

Best,

Fabio

[1] https://github.com/numpy/numpy/blob/main/numpy/polynomial/chebyshev.py#L817
[2] https://www.sciencedirect.com/science/article/pii/0024379595006966
[3] 
https://colab.research.google.com/drive/1JtDDeWC1CEQHDidZ9f5_Ma_ifoBv4Tuz?usp=sharing
___
NumPy-Discussion mailing list -- numpy-discussion@python.org
To unsubscribe send an email to numpy-discussion-le...@python.org
https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
Member address: arch...@mail-archive.com


[Numpy-discussion] Re: Proposal to accept NEP 55: Add a UTF-8 variable-width string DType to NumPy

2024-01-24 Thread Sebastian Berg
On Mon, 2024-01-22 at 17:08 -0700, Nathan wrote:
> Hi all,
> 
> I propose we accept NEP 55 and merge PR #25347 implementing the NEP
> in time
> for the NumPy 2.0 RC:


I really like this work and I think it is a big improvement!  At this
point we probably have to expect some things to be still buggy, but
that is also a reason to get it in (testing is hard if it isn't shipped
first-class unfortunately).

Nathan summarized the things I might have brought up very well.  The 
support of missing values is the one thing that to me may end up a bit
more in flux.
But I am happy to hope that this is in a way that pandas will not be
affected and, honestly, without deep integration testing we won't make
progress in figuring out whether there is some change needed or not.

Thanks for the great work!

- Sebastian


> 
> https://numpy.org/neps/nep-0055-string_dtype.html
> https://github.com/numpy/numpy/pull/25347
> 
> The most controversial aspect of the NEP was support for missing
> strings
> via a user-supplied sentinel object. In the previous discussion on
> the
> mailing list, Warren Weckesser argued for shipping a missing data
> sentinel
> with NumPy for use with the DType, while in code review and the PR
> for the
> NEP, Sebestian expressed concern about the additional complexity of
> including missing data support at all.
> 
> I found that supporting missing data is key to efficiently supporting
> the
> new DType in Pandas. I think that argues that we need some level of
> missing
> data support to fully replace object string arrays. I believe the
> compromise proposal in the NEP is sufficient for downstream libraries
> while
> limiting additional complexity elsewhere in NumPy.
> 
> Concerns raised in previous discussions about concretely specifying
> the C
> API to be made public, preventing use-after-free errors in a
> multithreaded
> context, and uncertainty around the arena allocator implementation
> have
> been resolved in the latest version of the NEP and the open PR.
> Additionally, due to some excellent and timely work by Lysandros
> Nikolaou,
> we now have a number of string ufuncs in NumPy and a straightforward
> plan
> to add more. Loops have been implemented for all the ufuncs added in
> the
> NumPy 2.0 dev cycle so far.
> 
> I would like to see us ship the DType in NumPy 2.0. This will allow
> us to
> advertise a major new feature, will spur efforts to support new
> DTypes in
> downstream libraries, and will allow us to get feedback from the
> community
> that would be difficult to obtain without releasing the code into the
> wild.
> Additionally, I am funded via a NASA ROSES grant for work related to
> this
> effort until the end of 2024, so including the DType in NumPy 2.0
> will more
> efficiently use my funded time to fix issues.
> 
> If there are no substantive objections to this email, then the NEP
> will be
> considered accepted; see NEP 0 for more details:
> https://numpy.org/neps/nep-.html
> ___
> NumPy-Discussion mailing list -- numpy-discussion@python.org
> To unsubscribe send an email to numpy-discussion-le...@python.org
> https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
> Member address: sebast...@sipsolutions.net


___
NumPy-Discussion mailing list -- numpy-discussion@python.org
To unsubscribe send an email to numpy-discussion-le...@python.org
https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
Member address: arch...@mail-archive.com