Hello,

I would like to propose adding an optional `out` array parameter to
`bincount`. This would make `bincount` much more useful when iteratively
tallying data with large indices.

Consider this example tallying batches of values from some fictional source of 
data:

>>> tally = np.zeros(10000**2)
>>> for indices, weights in read_sensor_data():
...    tally += np.bincount(indices, weights, 10000**2)  # slow: repeatedly adding large arrays

This could be trivially sped up:

>>> tally = np.zeros(10000**2)
>>> for indices, weights in read_sensor_data():
...    np.bincount(indices, weights, out=tally)  # fast: plain sum-loop in C

As far as I can see, there is no equivalent numpy functionality. In fact, as 
far as I'm aware, there isn't any fast alternative outside of 
C/Cython/numba/... It also fits the purpose of `bincount` nicely, and does not 
change existing functionality. One might argue about the exact semantics if
both `minlength` and `out` are given, but I think a sensible answer is to
require `len(out) >= max(x.max() + 1, minlength)`, where `x` is the index
array, since index `x.max()` needs a bin of its own.
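
To make the intended semantics concrete, here is a rough pure-Python sketch of what I have in mind. The helper name `bincount_into` is made up for illustration and is not an existing NumPy function. It leans on `np.add.at`, which has historically been considerably slower than `bincount`, so it only illustrates the behaviour and the length check, not the speed-up:

```python
import numpy as np

def bincount_into(out, x, weights=None, minlength=0):
    """Hypothetical reference for the proposed `out` semantics (not a NumPy API)."""
    x = np.asarray(x, dtype=np.intp)
    if x.size and x.min() < 0:
        raise ValueError("indices must be non-negative")
    # Index x.max() needs its own bin, hence the "+ 1".
    needed = max(int(x.max()) + 1 if x.size else 0, minlength)
    if len(out) < needed:
        raise ValueError(f"out has length {len(out)}, need at least {needed}")
    # np.add.at performs an unbuffered add, so repeated indices accumulate.
    np.add.at(out, x, 1 if weights is None else weights)
    return out

# Accumulate two batches into the same tally without temporary arrays.
tally = np.zeros(6)
bincount_into(tally, [0, 1, 1, 5], [0.5, 1.0, 2.0, 4.0])
bincount_into(tally, [1, 5], [1.0, 1.0])
```

In this sketch `out` is updated in place and accumulated into rather than overwritten, which is what the tallying loop above relies on.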
_______________________________________________
NumPy-Discussion mailing list -- numpy-discussion@python.org