Re: [Numpy-discussion] Make np.bincount output same dtype as weights

2016-03-28 Thread Jaime Fernández del Río
Have modified the PR to do the "promote integers to at least long" we do in
np.sum.
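
For readers following along, here is a minimal sketch of the np.sum promotion
referred to above. It is an illustration only, not code from the PR, and the
array values are made up:

```python
import numpy as np

small = np.array([100, 100, 100], dtype=np.int8)

# np.sum accumulates small signed integers in the platform default integer,
# so 300 is representable even though it does not fit in int8.
total = np.sum(small)
print(total, total.dtype)

# Forcing the accumulator to stay int8 wraps around instead.
wrapped = np.sum(small, dtype=np.int8)
print(wrapped, wrapped.dtype)
```

The proposal is for weighted bincount to pick its output dtype the same way
the first call does.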

Jaime

On Mon, Mar 28, 2016 at 9:55 PM, CJ Carey <perimosocord...@gmail.com> wrote:

> Another +1 for Josef's interpretation from me. Consistency with np.sum
> seems like the best option.
>
> On Sat, Mar 26, 2016 at 11:12 PM, Juan Nunez-Iglesias <jni.s...@gmail.com>
> wrote:
>
>> Thanks for clarifying, Jaime, and fwiw I agree with Josef: I would expect
>> np.bincount to behave like np.sum with regards to promoting weights dtypes.
>> Including bool.
>>
>> On Sun, Mar 27, 2016 at 1:58 PM, <josef.p...@gmail.com> wrote:
>>
>>> On Sat, Mar 26, 2016 at 9:54 PM, Joseph Fox-Rabinovitz
>>> <jfoxrabinov...@gmail.com> wrote:
>>> > Would it make sense to just make the output type large enough to hold
>>> > the cumulative sum of the weights?
>>> >
>>> >
>>> > - Joseph Fox-Rabinovitz
>>> >
>>> > -- Original message------
>>> >
>>> > From: Jaime Fernández del Río
>>> >
>>> > Date: Sat, Mar 26, 2016 16:16
>>> >
>>> > To: Discussion of Numerical Python;
>>> >
>>> > Subject: [Numpy-discussion] Make np.bincount output same dtype as weights
>>> >
>>> > Hi all,
>>> >
>>> > I have just submitted a PR (#7464) that fixes an enhancement request
>>> > (#6854), making np.bincount return an array of the same type as the
>>> > weights parameter.  This is an important deviation from current
>>> > behavior, which always casts weights to double, and always returns a
>>> > double array, so I would like to hear what others think about the
>>> > worthiness of this.  Main discussion points:
>>> >
>>> > np.bincount now works with complex weights (yay!), I guess this should
>>> > be a pretty uncontroversial enhancement.
>>> > The return is of the same type as weights, which means that small
>>> > integers are very likely to overflow.  This is exactly what #6854
>>> > requested, but perhaps we should promote the output for integers to a
>>> > long, as we do in np.sum?
>>>
>>> I always thought of bincount with weights just as a group-by sum. So
>>> it would be easier to remember and have fewer surprises if it matches
>>> the behavior of np.sum.
>>>
>>> > Boolean arrays stay boolean, and OR, rather than sum, the weights. Is
>>> > this what one would want? If we decide that integer promotion is the
>>> > way to go, perhaps booleans should go in the same pack?
>>>
>>> Isn't this calculating the sum, i.e. count of True by group, already?
>>> Based on a quick example with numpy 1.9.2, I don't think I ever used
>>> bool weights before.
>>>
>>>
>>> > This new implementation currently supports all of the reasonable native
>>> > types, but has no fallback for user defined types.  I guess we should
>>> > attempt to cast the array to double as before if no native loop can be
>>> > found? It would be good to have a way of testing this though, any
>>> > thoughts on how to go about this?
>>> > Does a behavior change like this require some deprecation period? What
>>> > would that look like?
>>> > I have also added broadcasting of weights to the full size of list, so
>>> > that one can do e.g. np.bincount([1, 2, 3], weights=2j) without having
>>> > to tile the single weight to the size of the bins list.
>>> >
>>> > Any other thoughts are very welcome as well!
>>>
>>> (2-D weights ?)
>>>
>>>
>>> Josef
>>>
>>>
>>> >
>>> > Jaime
>>> >
>>> > --
>>> > (\__/)
>>> > ( O.o)
>>> > ( > <) This is Bunny. Copy Bunny into your signature and help him with
>>> > his plans for world domination.
>>> >
>>> > ___
>>> > NumPy-Discussion mailing list
>>> > NumPy-Discussion@scipy.org
>>> > https://mail.scipy.org/mailman/listinfo/numpy-discussion


-- 
(\__/)
( O.o)
( > <) This is Bunny. Copy Bunny into your signature and help him with his
plans for world domination.


Re: [Numpy-discussion] Make np.bincount output same dtype as weights

2016-03-28 Thread CJ Carey
Another +1 for Josef's interpretation from me. Consistency with np.sum
seems like the best option.

On Sat, Mar 26, 2016 at 11:12 PM, Juan Nunez-Iglesias <jni.s...@gmail.com>
wrote:

> Thanks for clarifying, Jaime, and fwiw I agree with Josef: I would expect
> np.bincount to behave like np.sum with regards to promoting weights dtypes.
> Including bool.
>
> On Sun, Mar 27, 2016 at 1:58 PM, <josef.p...@gmail.com> wrote:
>
>> On Sat, Mar 26, 2016 at 9:54 PM, Joseph Fox-Rabinovitz
>> <jfoxrabinov...@gmail.com> wrote:
>> > Would it make sense to just make the output type large enough to hold
>> > the cumulative sum of the weights?
>> >
>> >
>> > - Joseph Fox-Rabinovitz
>> >
>> > -- Original message--
>> >
>> > From: Jaime Fernández del Río
>> >
>> > Date: Sat, Mar 26, 2016 16:16
>> >
>> > To: Discussion of Numerical Python;
>> >
>> > Subject:[Numpy-discussion] Make np.bincount output same dtype as weights
>> >
>> > Hi all,
>> >
>> > I have just submitted a PR (#7464) that fixes an enhancement request
>> > (#6854), making np.bincount return an array of the same type as the
>> > weights parameter.  This is an important deviation from current
>> > behavior, which always casts weights to double, and always returns a
>> > double array, so I would like to hear what others think about the
>> > worthiness of this.  Main discussion points:
>> >
>> > np.bincount now works with complex weights (yay!), I guess this should
>> > be a pretty uncontroversial enhancement.
>> > The return is of the same type as weights, which means that small
>> > integers are very likely to overflow.  This is exactly what #6854
>> > requested, but perhaps we should promote the output for integers to a
>> > long, as we do in np.sum?
>>
>> I always thought of bincount with weights just as a group-by sum. So
>> it would be easier to remember and have fewer surprises if it matches
>> the behavior of np.sum.
>>
>> > Boolean arrays stay boolean, and OR, rather than sum, the weights. Is
>> > this what one would want? If we decide that integer promotion is the
>> > way to go, perhaps booleans should go in the same pack?
>>
>> Isn't this calculating the sum, i.e. count of True by group, already?
>> Based on a quick example with numpy 1.9.2, I don't think I ever used
>> bool weights before.
>>
>>
>> > This new implementation currently supports all of the reasonable native
>> > types, but has no fallback for user defined types.  I guess we should
>> > attempt to cast the array to double as before if no native loop can be
>> > found? It would be good to have a way of testing this though, any
>> > thoughts on how to go about this?
>> > Does a behavior change like this require some deprecation period? What
>> > would that look like?
>> > I have also added broadcasting of weights to the full size of list, so
>> > that one can do e.g. np.bincount([1, 2, 3], weights=2j) without having
>> > to tile the single weight to the size of the bins list.
>> >
>> > Any other thoughts are very welcome as well!
>>
>> (2-D weights ?)
>>
>>
>> Josef
>>
>>
>> >
>> > Jaime
>> >
>> > --
>> > (\__/)
>> > ( O.o)
>> > ( > <) This is Bunny. Copy Bunny into your signature and help him with
>> > his plans for world domination.
>> >


Re: [Numpy-discussion] Make np.bincount output same dtype as weights

2016-03-26 Thread Juan Nunez-Iglesias
Thanks for clarifying, Jaime, and fwiw I agree with Josef: I would expect
np.bincount to behave like np.sum with regards to promoting weights dtypes.
Including bool.

On Sun, Mar 27, 2016 at 1:58 PM, <josef.p...@gmail.com> wrote:

> On Sat, Mar 26, 2016 at 9:54 PM, Joseph Fox-Rabinovitz
> <jfoxrabinov...@gmail.com> wrote:
> > Would it make sense to just make the output type large enough to hold the
> > cumulative sum of the weights?
> >
> >
> > - Joseph Fox-Rabinovitz
> >
> > -- Original message--
> >
> > From: Jaime Fernández del Río
> >
> > Date: Sat, Mar 26, 2016 16:16
> >
> > To: Discussion of Numerical Python;
> >
> > Subject:[Numpy-discussion] Make np.bincount output same dtype as weights
> >
> > Hi all,
> >
> > I have just submitted a PR (#7464) that fixes an enhancement request
> > (#6854), making np.bincount return an array of the same type as the
> > weights parameter.  This is an important deviation from current behavior,
> > which always casts weights to double, and always returns a double array,
> > so I would like to hear what others think about the worthiness of this.
> > Main discussion points:
> >
> > np.bincount now works with complex weights (yay!), I guess this should
> > be a pretty uncontroversial enhancement.
> > The return is of the same type as weights, which means that small
> > integers are very likely to overflow.  This is exactly what #6854
> > requested, but perhaps we should promote the output for integers to a
> > long, as we do in np.sum?
>
> I always thought of bincount with weights just as a group-by sum. So
> it would be easier to remember and have fewer surprises if it matches
> the behavior of np.sum.
>
> > Boolean arrays stay boolean, and OR, rather than sum, the weights. Is
> > this what one would want? If we decide that integer promotion is the way
> > to go, perhaps booleans should go in the same pack?
>
> Isn't this calculating the sum, i.e. count of True by group, already?
> Based on a quick example with numpy 1.9.2, I don't think I ever used
> bool weights before.
>
>
> > This new implementation currently supports all of the reasonable native
> > types, but has no fallback for user defined types.  I guess we should
> > attempt to cast the array to double as before if no native loop can be
> > found? It would be good to have a way of testing this though, any
> > thoughts on how to go about this?
> > Does a behavior change like this require some deprecation period? What
> > would that look like?
> > I have also added broadcasting of weights to the full size of list, so
> > that one can do e.g. np.bincount([1, 2, 3], weights=2j) without having to
> > tile the single weight to the size of the bins list.
> >
> > Any other thoughts are very welcome as well!
>
> (2-D weights ?)
>
>
> Josef
>
>
> >
> > Jaime
> >
> > --
> > (\__/)
> > ( O.o)
> > ( > <) This is Bunny. Copy Bunny into your signature and help him with
> > his plans for world domination.
> >


Re: [Numpy-discussion] Make np.bincount output same dtype as weights

2016-03-26 Thread josef.pktd
On Sat, Mar 26, 2016 at 9:54 PM, Joseph Fox-Rabinovitz
<jfoxrabinov...@gmail.com> wrote:
> Would it make sense to just make the output type large enough to hold the
> cumulative sum of the weights?
>
>
> - Joseph Fox-Rabinovitz
>
> -- Original message--
>
> From: Jaime Fernández del Río
>
> Date: Sat, Mar 26, 2016 16:16
>
> To: Discussion of Numerical Python;
>
> Subject:[Numpy-discussion] Make np.bincount output same dtype as weights
>
> Hi all,
>
> I have just submitted a PR (#7464) that fixes an enhancement request
> (#6854), making np.bincount return an array of the same type as the weights
> parameter.  This is an important deviation from current behavior, which
> always casts weights to double, and always returns a double array, so I
> would like to hear what others think about the worthiness of this.  Main
> discussion points:
>
> np.bincount now works with complex weights (yay!), I guess this should be a
> pretty uncontroversial enhancement.
> The return is of the same type as weights, which means that small integers
> are very likely to overflow.  This is exactly what #6854 requested, but
> perhaps we should promote the output for integers to a long, as we do in
> np.sum?

I always thought of bincount with weights just as a group-by sum. So
it would be easier to remember and have fewer surprises if it matches
the behavior of np.sum.
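
A small sketch of the "group-by sum" reading, with made-up group labels and
weights. The explicit loop is only there for comparison and is not how
bincount is implemented:

```python
import numpy as np

groups = np.array([0, 1, 1, 2, 2, 2])
weights = np.array([0.5, 1.0, 2.0, 3.0, 4.0, 5.0])

# Weighted bincount: one accumulated total per group label.
binned = np.bincount(groups, weights=weights)

# The same totals computed group by group with np.sum.
by_hand = np.array(
    [weights[groups == g].sum() for g in range(groups.max() + 1)]
)

print(binned)
print(by_hand)
```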

> Boolean arrays stay boolean, and OR, rather than sum, the weights. Is this
> what one would want? If we decide that integer promotion is the way to go,
> perhaps booleans should go in the same pack?

Isn't this calculating the sum, i.e. count of True by group, already?
Based on a quick example with numpy 1.9.2, I don't think I ever used
bool weights before.
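
A quick example of this kind might look like the sketch below. It assumes a
NumPy release where bincount casts weights to double (the behavior described
as current in this thread); the arrays are made up:

```python
import numpy as np

groups = np.array([0, 0, 1, 1, 1])
flags = np.array([True, False, True, True, True])

# With weights cast to double, bool weights give a per-group count of True.
counts = np.bincount(groups, weights=flags)
print(counts, counts.dtype)

# np.sum on a bool array also counts True values, as an integer.
n_true = np.sum(flags)
print(n_true, n_true.dtype)
```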


> This new implementation currently supports all of the reasonable native
> types, but has no fallback for user defined types.  I guess we should
> attempt to cast the array to double as before if no native loop can be
> found? It would be good to have a way of testing this though, any thoughts
> on how to go about this?
> Does a behavior change like this require some deprecation period? What would
> that look like?
> I have also added broadcasting of weights to the full size of list, so that
> one can do e.g. np.bincount([1, 2, 3], weights=2j) without having to tile
> the single weight to the size of the bins list.
>
> Any other thoughts are very welcome as well!

(2-D weights ?)


Josef


>
> Jaime
>
> --
> (\__/)
> ( O.o)
> ( > <) This is Bunny. Copy Bunny into your signature and help him with his
> plans for world domination.
>


Re: [Numpy-discussion] Make np.bincount output same dtype as weights

2016-03-26 Thread Joseph Fox-Rabinovitz






Would it make sense to just make the output type large enough to hold the 
cumulative sum of the weights?
- Joseph Fox-Rabinovitz
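
One way to read this suggestion, as a rough sketch: size the output from the
worst case where every weight lands in one bin. The helper name
bincount_fitted is hypothetical, it assumes non-negative integer weights and
a non-empty x, and it is not what the PR does:

```python
import numpy as np

def bincount_fitted(x, weights):
    """Hypothetical helper: weighted bincount with a data-dependent dtype."""
    x = np.asarray(x)
    weights = np.asarray(weights)
    # Worst case: all weights fall into the same bin. Summing as Python
    # ints means the bound itself cannot overflow.
    bound = sum(int(w) for w in weights)
    # Smallest dtype that holds the bound, but never narrower than the input.
    out_dtype = np.promote_types(weights.dtype, np.min_scalar_type(bound))
    out = np.zeros(int(x.max()) + 1, dtype=out_dtype)
    # Unbuffered in-place add, so repeated indices accumulate correctly.
    np.add.at(out, x, weights)
    return out

result = bincount_fitted(
    [0, 0, 0, 1], np.array([100, 100, 100, 7], dtype=np.uint8)
)
print(result, result.dtype)
```

The catch is that the output dtype then depends on the values, not just on
the dtype of weights, which makes results harder to predict than the np.sum
rule.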


-- Original message--
From: Jaime Fernández del Río
Date: Sat, Mar 26, 2016 16:16
To: Discussion of Numerical Python;
Subject: [Numpy-discussion] Make np.bincount output same dtype as weights

Hi all,

I have just submitted a PR (#7464) that fixes an enhancement request (#6854),
making np.bincount return an array of the same type as the weights parameter.
This is an important deviation from current behavior, which always casts
weights to double, and always returns a double array, so I would like to hear
what others think about the worthiness of this.  Main discussion points:

   - np.bincount now works with complex weights (yay!), I guess this should
   be a pretty uncontroversial enhancement.
   - The return is of the same type as weights, which means that small
   integers are very likely to overflow.  This is exactly what #6854
   requested, but perhaps we should promote the output for integers to a
   long, as we do in np.sum?
   - Boolean arrays stay boolean, and OR, rather than sum, the weights. Is
   this what one would want? If we decide that integer promotion is the way to
   go, perhaps booleans should go in the same pack?
   - This new implementation currently supports all of the reasonable
   native types, but has no fallback for user defined types.  I guess we
   should attempt to cast the array to double as before if no native loop can
   be found? It would be good to have a way of testing this though, any
   thoughts on how to go about this?
   - Does a behavior change like this require some deprecation period? What
   would that look like?
   - I have also added broadcasting of weights to the full size of list, so
   that one can do e.g. np.bincount([1, 2, 3], weights=2j) without having
   to tile the single weight to the size of the bins list.

Any other thoughts are very welcome as well!

Jaime

-- 
(\__/)
( O.o)
( > <) This is Bunny. Copy Bunny into your signature and help him with his
plans for world domination.


Re: [Numpy-discussion] Make np.bincount output same dtype as weights

2016-03-26 Thread Juan Nunez-Iglesias
Just to clarify, this will only affect weighted bincounts, right? I can't tell 
you in how many places my code depends on the return type being integer!!!
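
To make the distinction concrete, a minimal sketch with made-up labels. It
assumes a NumPy release where weights are cast to double, as described in the
quoted message below:

```python
import numpy as np

labels = np.array([1, 2, 2, 3])

# Unweighted: plain occurrence counts, returned as an integer array.
unweighted = np.bincount(labels)
print(unweighted, unweighted.dtype)

# Weighted: the only case whose output dtype is under discussion.
weighted = np.bincount(labels, weights=np.ones(len(labels)))
print(weighted, weighted.dtype)
```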


On 27 Mar 2016, 7:16 AM +1100, Jaime Fernández del Río wrote:
> Hi all,
>
> I have just submitted a PR (#7464, https://github.com/numpy/numpy/pull/7464)
> that fixes an enhancement request
> (#6854, https://github.com/numpy/numpy/issues/6854), making np.bincount
> return an array of the same type as the weights parameter. This is an
> important deviation from current behavior, which always casts weights to
> double, and always returns a double array, so I would like to hear what
> others think about the worthiness of this. Main discussion points:
>
> - np.bincount now works with complex weights (yay!), I guess this should be
>   a pretty uncontroversial enhancement.
> - The return is of the same type as weights, which means that small integers
>   are very likely to overflow. This is exactly what #6854 requested, but
>   perhaps we should promote the output for integers to a long, as we do in
>   np.sum?
> - Boolean arrays stay boolean, and OR, rather than sum, the weights. Is this
>   what one would want? If we decide that integer promotion is the way to go,
>   perhaps booleans should go in the same pack?
> - This new implementation currently supports all of the reasonable native
>   types, but has no fallback for user defined types. I guess we should
>   attempt to cast the array to double as before if no native loop can be
>   found? It would be good to have a way of testing this though, any thoughts
>   on how to go about this?
> - Does a behavior change like this require some deprecation period? What
>   would that look like?
> - I have also added broadcasting of weights to the full size of list, so
>   that one can do e.g. np.bincount([1, 2, 3], weights=2j) without having to
>   tile the single weight to the size of the bins list.
>
> Any other thoughts are very welcome as well!
>
> Jaime
>
> --
> (\__/)
> ( O.o)
> ( > <) This is Bunny. Copy Bunny into your signature and help him with his
> plans for world domination.


[Numpy-discussion] Make np.bincount output same dtype as weights

2016-03-26 Thread Jaime Fernández del Río
Hi all,

I have just submitted a PR (#7464)
that fixes an enhancement request (#6854), making np.bincount return an
array of the same type as the weights parameter.  This is an important
deviation from current behavior, which always casts weights to double, and
always returns a double array, so I would like to hear what others think
about the worthiness of this.  Main discussion points:

   - np.bincount now works with complex weights (yay!), I guess this should
   be a pretty uncontroversial enhancement.
   - The return is of the same type as weights, which means that small
   integers are very likely to overflow.  This is exactly what #6854
   requested, but perhaps we should promote the output for integers to a
   long, as we do in np.sum?
   - Boolean arrays stay boolean, and OR, rather than sum, the weights. Is
   this what one would want? If we decide that integer promotion is the way to
   go, perhaps booleans should go in the same pack?
   - This new implementation currently supports all of the reasonable
   native types, but has no fallback for user defined types.  I guess we
   should attempt to cast the array to double as before if no native loop can
   be found? It would be good to have a way of testing this though, any
   thoughts on how to go about this?
   - Does a behavior change like this require some deprecation period? What
   would that look like?
   - I have also added broadcasting of weights to the full size of list, so
   that one can do e.g. np.bincount([1, 2, 3], weights=2j) without having
   to tile the single weight to the size of the bins list.
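
For comparison, a sketch of what the last point looks like without the PR,
assuming a NumPy release where weights are cast to double. The
real-plus-imaginary split for complex weights is a workaround added here for
illustration, not part of the PR:

```python
import numpy as np

bins = np.array([1, 2, 3])

# A single weight has to be tiled by hand to the length of bins.
tiled = np.full(bins.shape, 2.0)
real_result = np.bincount(bins, weights=tiled)
print(real_result)

# Complex weights: bin the real and imaginary parts separately, since each
# part is an ordinary float array, then recombine.
w = np.full(bins.shape, 2j)
complex_result = (
    np.bincount(bins, weights=w.real) + 1j * np.bincount(bins, weights=w.imag)
)
print(complex_result)
```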

Any other thoughts are very welcome as well!

Jaime

-- 
(\__/)
( O.o)
( > <) This is Bunny. Copy Bunny into your signature and help him with his
plans for world domination.