On Sat, Mar 26, 2016 at 9:54 PM, Joseph Fox-Rabinovitz
<jfoxrabinov...@gmail.com> wrote:
> Would it make sense to just make the output type large enough to hold the
> cumulative sum of the weights?
>
> - Joseph Fox-Rabinovitz
>
> ------ Original message------
> From: Jaime Fernández del Río
> Date: Sat, Mar 26, 2016 16:16
> To: Discussion of Numerical Python;
> Subject:[Numpy-discussion] Make np.bincount output same dtype as weights
>
> Hi all,
>
> I have just submitted a PR (#7464) that fixes an enhancement request
> (#6854), making np.bincount return an array of the same type as the weights
> parameter. This is an important deviation from current behavior, which
> always casts weights to double, and always returns a double array, so I
> would like to hear what others think about the worthiness of this. Main
> discussion points:
>
> np.bincount now works with complex weights (yay!), I guess this should be a
> pretty uncontroversial enhancement.
>
> The return is of the same type as weights, which means that small integers
> are very likely to overflow. This is exactly what #6854 requested, but
> perhaps we should promote the output for integers to a long, as we do in
> np.sum?

I always thought of bincount with weights just as a group-by sum. So it
would be easier to remember and have fewer surprises if it matches the
behavior of np.sum.

> Boolean arrays stay boolean, and OR, rather than sum, the weights. Is this
> what one would want? If we decide that integer promotion is the way to go,
> perhaps booleans should go in the same pack?

Isn't this already calculating the sum, i.e. the count of True by group?
That is what a quick example with numpy 1.9.2 shows; I don't think I ever
used bool weights before.

> This new implementation currently supports all of the reasonable native
> types, but has no fallback for user defined types. I guess we should
> attempt to cast the array to double as before if no native loop can be
> found? It would be good to have a way of testing this though, any thoughts
> on how to go about this?
>
> Does a behavior change like this require some deprecation period? What would
> that look like?
>
> I have also added broadcasting of weights to the full size of list, so that
> one can do e.g. np.bincount([1, 2, 3], weights=2j) without having to tile
> the single weight to the size of the bins list.
>
> Any other thoughts are very welcome as well!

(2-D weights ?)

Josef

>
> Jaime
>
> --
> (__/)
> ( O.o)
> ( > <) This is Rabbit. Copy Rabbit into your signature and help him with
> his plans for world domination.
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> https://mail.scipy.org/mailman/listinfo/numpy-discussion
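
As a rough illustration of the points raised in this thread (this is not code from the PR), the sketch below compares np.sum's integer promotion, an accumulation forced to stay in int8, and np.bincount with integer and boolean weights. The sample arrays `bins`, `w` and `flags` are made up for the example, and the bincount results reflect the cast-to-double behavior described in the quoted message, not the proposed change.

```python
import numpy as np

# Hypothetical sample data: six items falling into bins 0, 1 and 2.
bins = np.array([0, 1, 1, 2, 2, 2])
w = np.full(6, 100, dtype=np.int8)  # small-integer weights

# np.sum promotes small integers to the platform default integer,
# so the true total of 600 fits.
total = np.sum(w)

# Forcing the accumulator to stay int8 wraps around silently; this is
# the kind of overflow a same-dtype output could run into.
wrapped = w.sum(dtype=np.int8)

# bincount with weights, read as a group-by sum.
counts = np.bincount(bins, weights=w)

# Reference group-by sum built on np.sum, one bin at a time.
ref = np.array([w[bins == k].sum() for k in range(bins.max() + 1)])

# Boolean weights: with the cast to double, each bin holds the
# number of True entries that fall into it.
flags = np.array([True, False, True, True, False, True])
bool_counts = np.bincount(bins, weights=flags)

print(total, wrapped)
print(counts, ref)
print(bool_counts)
```

With these inputs the per-bin sums agree with the np.sum-based reference, which is the "fewer surprises" behavior argued for above; only the int8 accumulator loses information.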