[Numpy-discussion] Re: Add to NumPy a function to compute cumulative sums from 0.

Ilhan Polat Tue, 15 Aug 2023 06:35:09 -0700

On Tue, Aug 15, 2023 at 2:44 PM <[email protected]> wrote:


> > From my point of view, such a function is a bit of a corner case to be
> > added to numpy. And it doesn’t justify its naming anymore. It is not one
> > operation anymore. It is a cumsum and prepending 0. And it is very
> > difficult to argue why prepending 0 to cumsum is a part of cumsum.
>
> That is backwards. Consider the array [x0, x1, x2].
>
> The sum of the first 0 elements is 0.
> The sum of the first 1 elements is x0.
> The sum of the first 2 elements is x0+x1.
> The sum of the first 3 elements is x0+x1+x2.
>
> Hence, the array of partial sums is [0, x0, x0+x1, x0+x1+x2].
>
> Thus, the operation [x0, x1, x2] -> [0, x0, x0+x1, x0+x1+x2] is a natural
> and primitive one.
>
>
You are describing the intermediate results of ndarray.sum() laid out in an
array; sum is an aggregator that produces a single item from a list of
items. For an aggregator you can argue about what happens when items are
missing, and the values you list are exactly the values the accumulator
would take. However, cumsum, cumprod, diff etc. are "array functions". In
other words, they provide fast vectorized access to otherwise laborious for
loops. You have to consider the equivalent for loops working on the array
*data*, not the ideal math framework over the number field. An array
function does not start from an element that comes before the first
element, hence "no elements -> 0" applies only to sum and not to the array
function. Or at least that would be my argument.
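
To make the "equivalent for loops" concrete, here is a rough pure-Python
sketch of the two kinds of loop. The names loop_sum and loop_cumsum are made
up for illustration; this is not how NumPy implements either function.

```python
def loop_sum(data):
    """Aggregator: one accumulator seeded with 0, returned once at the end."""
    acc = 0
    for value in data:
        acc += value
    return acc


def loop_cumsum(data):
    """Array function: one output element per input element; the seed is never emitted."""
    out = []
    acc = 0
    for value in data:
        acc += value
        out.append(acc)
    return out


print(loop_sum([1, 2, 3]))     # 6
print(loop_cumsum([1, 2, 3]))  # [1, 3, 6]
print(loop_sum([]))            # 0
print(loop_cumsum([]))         # []
```

The seed value 0 exists in both loops, but only the aggregator ever hands it
back to the caller.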

If you have no elements, the cumulative sum is not 0; it is the empty
array, because there is no array to cumulatively "sum" (remember we are
working on the array to generate another array, not aggregating). You can
argue about what the empty set translates to under summation etc., but I
don't think it applies here. But that's my opinion. I'm not sure why folks
wanted to have this at all. It is the same as asking whether this code

for k in range(0):
    ...  # some code

should at least spin once (Fortran-ish behavior). I don't know why it
should. But then again, this turns into bikeshedding, with conflicting
idealistic mathy axioms thrown at each other.

NumPy cumsum returns an empty array for an empty array (I think all
software does this, including matlab). ndarray.sum(), however, returns
scalar 0 (and I think most software does this too), because the aggregation
is pretty much a no-op over the initialization value, as in the example
above:

x = 0
for k in range(0):
    x += 1
print(x)  # prints 0
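
A quick way to check both behaviors in NumPy itself; a small sketch, where
the variable names are mine:

```python
import numpy as np

empty = np.array([], dtype=float)

cumulative = np.cumsum(empty)  # array function: output has the input's length
total = np.sum(empty)          # aggregator: nothing to add, so the initial 0 comes back

print(cumulative.shape)  # (0,)
print(total)             # 0.0
```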

I think all of this points to missing convenience functionality for
extending arrays. In matlab "[0 arr 10]" nicely extends the array to a new
one, but in NumPy you need to punch in quite some code, and muster some
courage to remember whether the right name is hstack or vstack or concat or
block, which decreases the "code morale". So if people want to quickly
extend arrays, they either have to change the code for their needs or
create larger arrays, which is pretty much #6044. So I think this is a
feature request for a convenient "prepend" and "append", not on ufuncs but
on ndarray, because concatenation is just a pain in NumPy and a ubiquitous
operation all around. Hence we should probably get a decision on that
instead of discussing each case separately.
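
For reference, this is roughly what the existing spellings look like for a
1-D array. It is only a sketch of what is available today, not a proposal
for the "prepend"/"append" API; the variable names are illustrative.

```python
import numpy as np

arr = np.array([1, 2, 3])

# Cumulative sum starting from 0, built by explicit concatenation.
cumsum_from_zero = np.concatenate(([0], np.cumsum(arr)))

# Two rough analogues of matlab's [0 arr 10] for a 1-D array.
extended_r = np.r_[0, arr, 10]
extended_cat = np.concatenate(([0], arr, [10]))

print(cumsum_from_zero)  # [0 1 3 6]
print(extended_r)
print(extended_cat)
```

np.r_ comes closest to the matlab syntax, but it is an indexing trick rather
than a method on ndarray, which is part of the discoverability problem.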
