On Sat, 2024-11-23 at 20:03 -0500, Marten van Kerkwijk wrote:
> Hi All,
> 
> This discussion about updating reduceat went silent, but recently I came
> back to my PR to allow `indices` to be a 2-dimensional array of start
> and stop values (or a tuple of separate start and stop arrays).  I
> thought a bit more about it and think it is the easiest way to extend
> the present definition.  So, I have added some tests and documentation
> and would now like to open it for proper discussion.  See
> 
> https://github.com/numpy/numpy/pull/25476


And another try much later :).

I think it would be nice to revive this old PR. This discussion (and
more, plus my old attempts to use it a long time ago) has convinced me
that any move forward would be nice.

But my opinion on the precise direction changed a bit :).

I would prefer to introduce a new `ufunc.segmented_reduce` or
`reduce_segmented`.
My reasoning is that the name is more descriptive, while overloading
`reduceat` with multiple calling conventions seems potentially confusing
and, to me, awkward in the long term. A new ufunc method seems cheap in
terms of API surface.
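
To make the semantics concrete, here is a minimal sketch that emulates a
segmented reduce in pure Python on top of `ufunc.reduce`. The helper name
`segmented_reduce` and its signature are hypothetical (nothing like it
exists in NumPy today, and it is not the API from the PR); it only
assumes that each segment is the ordinary slice `a[start:stop]`.

```python
import numpy as np


def segmented_reduce(ufunc, a, starts, stops, initial=None):
    """Hypothetical helper: reduce a[start:stop] for each (start, stop) pair."""
    a = np.asarray(a)
    # Only forward `initial` when it was given, so that empty segments still
    # raise for ufuncs without an identity (e.g. np.minimum).
    kwargs = {} if initial is None else {"initial": initial}
    return np.array([ufunc.reduce(a[start:stop], **kwargs)
                     for start, stop in zip(starts, stops)])


a = np.arange(12)
sums = segmented_reduce(np.add, a, [1, 3, 5], [2, -1, 0])
# array([ 1, 52,  0])
mins = segmented_reduce(np.minimum, a, [1, 3, 5], [2, -1, 0], initial=10)
# array([ 1,  3, 10])
```

This is a slow Python loop and only meant to pin down what the result
should be, not how it would be implemented.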

The term probably comes from the HPC world (e.g. CUDA's CUB uses it).
One caveat is that it got adopted as `segment_sum` into e.g. JAX via
TensorFlow, but their API doesn't quite look like a segmented reduce
anymore (it is more of a `ufunc.at`/map-reduce/reduce-by-key, although
possibly with limitations that make this more of an implementation
detail).
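
To illustrate the difference between the two styles, here is a small
NumPy-only sketch. The reduce-by-key variant labels every element with a
segment id and scatters into the output with `ufunc.at`; the segmented
variant passes segment boundaries instead. The array names and the
segment layout are made up for the example.

```python
import numpy as np

data = np.arange(12)

# Reduce-by-key style: one segment id per element.
segment_ids = np.array([0, 0, 0, 1, 1, 1, 1, 2, 2, 2, 2, 2])
by_key = np.zeros(3, dtype=data.dtype)
np.add.at(by_key, segment_ids, data)
# array([ 3, 18, 45])

# Segmented style: only the start of each contiguous segment is given.
by_segment = np.add.reduceat(data, [0, 3, 7])
# array([ 3, 18, 45])
```

Both give the same sums here because the ids are sorted and contiguous.
The by-key form also accepts unsorted ids, which is why it reads more
like a scatter than a segmented reduce.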

So if anyone has thoughts on this, I would be interested! Otherwise,
this is a heads-up that I think we may push for this, or for the current
overloading of `reduceat`.

Cheers,

Sebastian



> From the examples there:
> ```
> a = np.arange(12)
> np.add.reduceat(a, ([1, 3, 5], [2, -1, 0]))
> # array([ 1, 52,  0])
> np.minimum.reduceat(a, ([1, 3, 5], [2, -1, 0]), initial=10)
> # array([ 1,  3, 10])
> np.minimum.reduceat(a, ([1, 3, 5], [2, -1, 0]))
> # ValueError: empty slice encountered with reduceat operation for
> # 'minimum', which does not have an identity. Specify 'initial'.
> ```
> Let me know what you all think,
> 
> Marten
> 
> p.s.  Rereading the thread, I see we discussed initial vs default values
> in some detail. This is interesting, but somewhat orthogonal to the PR,
> since it just copies behaviour already present for reduce.
> _______________________________________________
> NumPy-Discussion mailing list -- [email protected]
> To unsubscribe send an email to [email protected]
> https://mail.python.org/mailman3/lists/numpy-discussion.python.org/
> Member address: [email protected]