Re: [Numpy-discussion] Ticket #605 Incorrect behavior of numpy.histogram

2008-04-09 Thread Bruce Southey
Hi, I should have asked first (I hope that you don't mind), but I created Ticket #728 (http://scipy.org/scipy/numpy/ticket/728) for numpy.r_ because this incorrectly casts based on the array types. The bug is that -inf and inf are numpy floats but dbin is an array of ints.

Re: [Numpy-discussion] Ticket #605 Incorrect behavior of numpy.histogram

2008-04-08 Thread Hans Meine
On Monday, 07 April 2008 14:34:08, Hans Meine wrote: On Saturday, 05 April 2008 21:54:27, Anne Archibald wrote: There's also a fourth option - raise an exception if any points are outside the range. +1 I think this should be the default. Otherwise, I tend towards exclude, in order to

Re: [Numpy-discussion] Ticket #605 Incorrect behavior of numpy.histogram

2008-04-08 Thread David Huard
Hans, Note that the current histogram is buggy, in the sense that it assumes that all bins have the same width and computes db = bins[1]-bins[0]. This is why you get zeros everywhere. The current behavior has been heavily criticized and I think we should change it. My proposal is to have for
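To make the equal-width assumption concrete, here is a minimal sketch. The bin edges are made up for illustration and are not taken from the thread; the point is only that a single width computed from the first two edges differs from the actual per-bin widths once the bins are non-uniform.

```python
import numpy as np

# Hypothetical non-uniform bin edges, chosen only for illustration.
bins = np.array([0.0, 1.0, 3.0, 10.0])

# Width obtained when every bin is assumed to be as wide as the first one.
db = bins[1] - bins[0]

# Actual width of each bin.
widths = np.diff(bins)

print(db)      # 1.0
print(widths)  # [1. 2. 7.]
```

Any computation that uses db for all bins is therefore only right when np.diff(bins) is constant.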

Re: [Numpy-discussion] Ticket #605 Incorrect behavior of numpy.histogram

2008-04-08 Thread Bruce Southey
Hi, I agree that the current histogram should be changed. However, I am not sure 1.0.5 is the correct release for that. David, this doesn't work for your code: r= np.array([1,2,2,3,3,3,4,4,4,4,5,5,5,5,5]) dbin=[2,3,4] rc, rb=histogram(r, bins=dbin, discard=None) Returns: rc=[3 3] # Really
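For comparison, this is what the same data gives with np.histogram as it exists in current NumPy releases. This is not the 2008 function and not David's proposed code; the discard keyword from the proposal is not a parameter of np.histogram and is left out here.

```python
import numpy as np

r = np.array([1, 2, 2, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 5, 5])
dbin = [2, 3, 4]

# Current np.histogram reads dbin as bin edges: [2, 3) and [3, 4].
# The last bin is closed on the right; the 1 and the 5s lie outside
# the range and are silently dropped.
counts, edges = np.histogram(r, bins=dbin)
print(counts)  # [2 7]
```

Three edges give two bins, and the values outside [2, 4] are not counted at all, which is one of the behaviours debated in this thread.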

Re: [Numpy-discussion] Ticket #605 Incorrect behavior of numpy.histogram

2008-04-08 Thread David Huard
2008/4/8, Bruce Southey: Hi, I agree that the current histogram should be changed. However, I am not sure 1.0.5 is the correct release for that. We both agree. David, this doesn't work for your code: r = np.array([1,2,2,3,3,3,4,4,4,4,5,5,5,5,5]) dbin=[2,3,4] rc,

Re: [Numpy-discussion] Ticket #605 Incorrect behavior of numpy.histogram

2008-04-07 Thread Hans Meine
On Saturday, 05 April 2008 21:54:27, Anne Archibald wrote: There's also a fourth option - raise an exception if any points are outside the range. +1 I think this should be the default. Otherwise, I tend towards exclude, in order to have comparable bin sizes (when plotting, I always find

Re: [Numpy-discussion] Ticket #605 Incorrect behavior of numpy.histogram

2008-04-07 Thread David Huard
+1 for an outlier keyword. Note that this implies that when bins are passed explicitly, the edges are given (nbins+1), not simply the left edges (nbins). While we are refactoring histogram, I'd suggest adding an axis keyword. This is pretty straightforward to implement using the
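The edges convention (nbins+1 values for nbins bins) can be illustrated with np.histogram from current NumPy releases, which follows it. The sample data and edges below are made up for the example.

```python
import numpy as np

data = np.array([0.5, 1.5, 1.7, 2.2, 3.9])  # made-up sample
edges = [0, 1, 2, 3, 4]                      # nbins + 1 = 5 edges

counts, returned_edges = np.histogram(data, bins=edges)
print(len(counts), len(returned_edges))  # 4 5
print(counts)                            # [1 2 1 1]
```

With left edges only, the right end of the last bin would be undefined, which is why an explicit policy for out-of-range values becomes necessary.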

Re: [Numpy-discussion] Ticket #605 Incorrect behavior of numpy.histogram

2008-04-07 Thread Bruce Southey
Hi, Thanks David for pointing out the piece of information I forgot to add in my original email. -1 for 'raise an exception' because, as Dan points out, the problem stems from the user providing bins. +1 for the outliers keyword. Should 'exclude' distinguish points that are too low and those that are

Re: [Numpy-discussion] Ticket #605 Incorrect behavior of numpy.histogram

2008-04-07 Thread LB
+1 for axis and +1 for a keyword to define what to do with values outside the range. For the keyword, rather than 'outliers', I would propose 'discard' or 'exclude', because it could be used to describe the four possibilities: - discard='low' = values lower than the range are discarded,

Re: [Numpy-discussion] Ticket #605 Incorrect behavior of numpy.histogram

2008-04-07 Thread Tommy Grav
On Apr 7, 2008, at 4:14 PM, LB wrote: +1 for axis and +1 for a keyword to define what to do with values outside the range. For the keyword, rather than 'outliers', I would propose 'discard' or 'exclude', because it could be used to describe the four possibilities: - discard='low' =

Re: [Numpy-discussion] Ticket #605 Incorrect behavior of numpy.histogram

2008-04-07 Thread David Huard
On Apr 7, 2008, at 4:14 PM, LB wrote: +1 for axis and +1 for a keyword to define what to do with values outside the range. For the keyword, rather than 'outliers', I would propose 'discard' or 'exclude', because it could be used to describe the four possibilities: - discard='low'

Re: [Numpy-discussion] Ticket #605 Incorrect behavior of numpy.histogram

2008-04-06 Thread Tommy Grav
On Apr 5, 2008, at 2:01 PM, Bruce Southey wrote: Hi, I have been investigating Ticket #605 'Incorrect behavior of numpy.histogram' (http://scipy.org/scipy/numpy/ticket/605 ). I think that my preference depends on the definition of what the bin number means. If the bin numbers are the lower

[Numpy-discussion] Ticket #605 Incorrect behavior of numpy.histogram

2008-04-05 Thread Bruce Southey
Hi, I have been investigating Ticket #605 'Incorrect behavior of numpy.histogram' (http://scipy.org/scipy/numpy/ticket/605 ). The fix for this ticket really depends on what the expectations are for the bin limits and different applications have different behavior. Consequently, I think that

Re: [Numpy-discussion] Ticket #605 Incorrect behavior of numpy.histogram

2008-04-05 Thread James Philbin
The MATLAB behaviour is to extend the first bin to include all data down to -inf and extend the last bin to handle all data to inf. This is probably the behaviour with least surprise. Therefore, I would vote +1 for behaviour #1 by default, +1 for keeping the old behaviour #2 around as an option and
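One way to imitate the "extend the outer bins to -inf and inf" behaviour with current NumPy is to clip the data onto the outermost edges before counting. This is only a sketch, not MATLAB's implementation: the array is borrowed from the example posted elsewhere in this thread, and treating [2, 3, 4] as bin edges (the current np.histogram convention) is an assumption.

```python
import numpy as np

r = np.array([1, 2, 2, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 5, 5])
edges = np.array([2, 3, 4])

# Pull out-of-range values onto the outermost edges so that the first and
# last bins absorb them, which mimics extending those bins to -inf and +inf.
clipped = np.clip(r, edges[0], edges[-1])
counts, _ = np.histogram(clipped, bins=edges)
print(counts)  # [ 3 12]
```

Every point is counted (the counts sum to len(r)), at the price of outer bins that no longer have a meaningful width.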

Re: [Numpy-discussion] Ticket #605 Incorrect behavior of numpy.histogram

2008-04-05 Thread Anne Archibald
On 05/04/2008, Bruce Southey wrote: 1) Should the first bin contain all values less than or equal to the value of the first limit and the last bin contain all values greater than the value of the last limit? This produced the counts as: array([3, 3, 9]) (I termed this
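The counts array([3, 3, 9]) can be reproduced with a short sketch under two assumptions that are not confirmed by the truncated text: that the data and limits are the r and [2, 3, 4] posted elsewhere in this thread, and that each limit is read as the lower limit of its bin, with the first and last bins absorbing everything below and above. This is an illustration of that reading, not the code attached to the ticket.

```python
import numpy as np

r = np.array([1, 2, 2, 3, 3, 3, 4, 4, 4, 4, 5, 5, 5, 5, 5])
limits = np.array([2, 3, 4])  # read as lower limits, one per bin

# Index of the last limit that is <= each value; -1 means below every limit.
idx = np.searchsorted(limits, r, side="right") - 1
# Fold values below the first limit into the first bin.
idx = np.clip(idx, 0, len(limits) - 1)
counts = np.bincount(idx, minlength=len(limits))
print(counts)  # [3 3 9]
```

Here three limits give three bins, in contrast to the edges convention where three values define only two bins.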