[Numpy-discussion] Enhancement for generalized ufuncs

Warren Weckesser Thu, 11 Jul 2024 16:37:26 -0700

I have implemented quite a few generalized ufuncs over in ufunclab
(https://github.com/WarrenWeckesser/ufunclab), and in the process I
have accumulated a gufunc "wish list". Two items on that list are:


(1) the ability to impose constraints on the core dimensions that are
checked when the gufunc is called. By far the most common use-case I
have is requiring that a dimension have length at least 1. To do this
currently, I check the shapes in the ufunc loop function, and if they
are not valid, raise an exception and hope that the gufunc machinery
processes it as expected when the loop function returns. (Sorry, I'm
using lingo--"loop function", "core dimension", etc--that will be
familiar to those who already know the ufunc C API, but not so
familiar to general users of NumPy.)

(2) the ability to have the output dimension be a function of the
input dimensions, instead of being limited to one of the input
dimensions or an independent dimension. Then one could create, for
example, a 1-d convolution gufunc with shape signature that is
effectively `(m),(n)->(m + n - 1)` (corresponding to `mode='full'` in
`np.convolve`) and the gufunc code would automatically allocate the
output with the correct shape and dtype.

I have proposed a change in https://github.com/numpy/numpy/pull/26908
that makes both these features possible. A field is added to the
PyUFuncObject that is an optional user-defined C function that the
gufunc author implements. When a gufunc is called, this function is
called with an array of the values of the core dimensions of the input
and output arrays. Some or all of the output core dimensions might be
-1, meaning the arrays are to be allocated by the gufunc/iterator
machinery.  The new "hook" allows the user to check the given core
dimensions and raise an exception if some constraint is not satisfied.
The user-defined function can also replace those -1 values with sizes
that it computes based on the given input core dimensions.

To define the 1-d convolution gufunc, the actual shape signature that
is passed to `PyUFunc_FromFuncAndDataAndSignature` is `(m),(n)->(p)`.
When a user passes arrays with shapes, say, (20,) and (30,) as the
input and with no output array specified, the user-defined function
will get the array [20, 30, -1]. It would replace -1 with m + n - 1 =
49 and return. If the caller *does* include an output array in the
call, the core dimension of that array will be the third element of
the array passed to the user-defined function. In that case, the
function verifies that the value equals m + n - 1, and raises an
exception if it doesn't.

Here's that 1-d convolution, called `conv1d_full` here, in action:

```
In [14]: import numpy as np

In [15]: from experiment import conv1d_full

In [16]: type(conv1d_full)
Out[16]: numpy.ufunc
```

`m = 4`, `n = 6`, so the output shape is `p = m + n - 1 = 9`:

```
In [17]: conv1d_full([1, 2, 3, 4], [-1, 1, 2, 1.5, -2, 1])
Out[17]: array([-1. , -1. , 1. , 4.5, 11. , 9.5, 2. , -5. , 4. ])
```

Standard broadcasting:

```
In [18]: conv1d_full([[1, 2, 3, 4], [0.5, 0, -1, 1]], [-1, 1, 2, 1.5, -2, 1])
Out[18]:
array([[-1. , -1. , 1. , 4.5 , 11. , 9.5 , 2. , -5. , 4. ],
    [-0.5 , 0.5 , 2. , -1.25, -2. , 1. , 3.5 , -3. , 1. ]])
```

Comments here or over in the pull request are welcome. The essential
changes to the source code are just 7 lines in `ufunc_object.c` and 7
lines in `ufuncobject.h`. The rest of the changes in the PR create a
couple gufuncs that use the new feature, with corresponding unit
tests.

Warren
_______________________________________________
NumPy-Discussion mailing list -- numpy-discussion@python.org
To unsubscribe send an email to numpy-discussion-le...@python.org
https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
Member address: arch...@mail-archive.com

[Numpy-discussion] Enhancement for generalized ufuncs

Reply via email to