Re: [Numpy-discussion] Automatic number of bins for numpy histograms

2015-04-14 Thread Jaime Fernández del Río
On Tue, Apr 14, 2015 at 6:16 PM, Neil Girdhar wrote:
> If you're going to C, is there a reason not to go to C++ and include the
> already-written Boost code? Otherwise, why not use Python?
I think we have an explicit rule against C++, although I may be wrong. Not sure how much of boost we wou

Re: [Numpy-discussion] Consider improving numpy.outer's behavior with zero-dimensional vectors

2015-04-14 Thread Nathaniel Smith
I am, yes. On Apr 14, 2015 9:17 PM, "Neil Girdhar" wrote: > Ok, I didn't know that. Are you at pycon by any chance? > > On Tue, Apr 14, 2015 at 7:16 PM, Nathaniel Smith wrote: > >> On Tue, Apr 14, 2015 at 3:48 PM, Neil Girdhar >> wrote: >> > Yes, I totally agree with you regarding np.sum and n

Re: [Numpy-discussion] Automatic number of bins for numpy histograms

2015-04-14 Thread Neil Girdhar
By the way, the P^2 algorithm still needs to know how many bins you want. It just adapts the endpoints of the bins. I like adaptive=True. However, you will have to find a way to return both the bins and their calculated endpoints. The P^2 algorithm can also give approximate answers to numpy.
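To illustrate the point about a caller-supplied bin count with adapted endpoints, here is a minimal sketch. It is not the P^2 algorithm: it takes exact percentiles over the whole data set with np.percentile instead of a streaming estimate, and the helper name equal_count_histogram is hypothetical. It returns both the counts and the computed endpoints, in the same order np.histogram uses.

```python
import numpy as np

def equal_count_histogram(data, nbins):
    """Hypothetical helper: nbins is fixed by the caller, only the edges adapt."""
    data = np.asarray(data, dtype=float)
    # Exact percentiles stand in for streaming P^2 estimates (assumption).
    edges = np.percentile(data, np.linspace(0, 100, nbins + 1))
    counts, edges = np.histogram(data, bins=edges)
    return counts, edges

counts, edges = equal_count_histogram(np.arange(1000.0), 4)
print(counts)  # number of samples per adaptive bin
print(edges)   # the nbins + 1 computed endpoints
```

Because the edges follow the data's quantiles, each bin ends up with roughly the same number of samples, whatever the shape of the distribution.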

Re: [Numpy-discussion] Automatic number of bins for numpy histograms

2015-04-14 Thread Paul Hobson
On Tue, Apr 14, 2015 at 4:24 PM, Jaime Fernández del Río < jaime.f...@gmail.com> wrote: > On Tue, Apr 14, 2015 at 4:12 PM, Nathaniel Smith wrote: > >> On Mon, Apr 13, 2015 at 8:02 AM, Neil Girdhar >> wrote: >> > Can I suggest that we instead add the P-square algorithm for the dynamic >> > calcul

Re: [Numpy-discussion] Consider improving numpy.outer's behavior with zero-dimensional vectors

2015-04-14 Thread Neil Girdhar
Ok, I didn't know that. Are you at pycon by any chance? On Tue, Apr 14, 2015 at 7:16 PM, Nathaniel Smith wrote: > On Tue, Apr 14, 2015 at 3:48 PM, Neil Girdhar > wrote: > > Yes, I totally agree with you regarding np.sum and np.product, which is > why > > I didn't suggest np.add.reduce, np.mult

Re: [Numpy-discussion] Automatic number of bins for numpy histograms

2015-04-14 Thread Neil Girdhar
If you're going to C, is there a reason not to go to C++ and include the already-written Boost code? Otherwise, why not use Python? On Tue, Apr 14, 2015 at 7:24 PM, Jaime Fernández del Río < jaime.f...@gmail.com> wrote: > On Tue, Apr 14, 2015 at 4:12 PM, Nathaniel Smith wrote: > >> On Mon, Apr

Re: [Numpy-discussion] Automatic number of bins for numpy histograms

2015-04-14 Thread Jaime Fernández del Río
On Tue, Apr 14, 2015 at 4:12 PM, Nathaniel Smith wrote: > On Mon, Apr 13, 2015 at 8:02 AM, Neil Girdhar > wrote: > > Can I suggest that we instead add the P-square algorithm for the dynamic > > calculation of histograms? > > ( > http://pierrechainais.ec-lille.fr/Centrale/Option_DAD/IMPACT_files/

Re: [Numpy-discussion] Consider improving numpy.outer's behavior with zero-dimensional vectors

2015-04-14 Thread Nathaniel Smith
On Tue, Apr 14, 2015 at 3:48 PM, Neil Girdhar wrote:
> Yes, I totally agree with you regarding np.sum and np.product, which is why
> I didn't suggest np.add.reduce, np.multiply.reduce. I wasn't sure whether
> cumsum and cumprod might be on the line in your judgment.
Ah, I see. I think we should

Re: [Numpy-discussion] Automatic number of bins for numpy histograms

2015-04-14 Thread Nathaniel Smith
On Mon, Apr 13, 2015 at 8:02 AM, Neil Girdhar wrote: > Can I suggest that we instead add the P-square algorithm for the dynamic > calculation of histograms? > (http://pierrechainais.ec-lille.fr/Centrale/Option_DAD/IMPACT_files/Dynamic%20quantiles%20calcultation%20-%20P2%20Algorythm.pdf) > > This i

Re: [Numpy-discussion] Automatic number of bins for numpy histograms

2015-04-14 Thread Neil Girdhar
Yes, you're right. Although in practice, people almost always want adaptive bins. On Tue, Apr 14, 2015 at 5:08 PM, Chris Barker wrote: > On Mon, Apr 13, 2015 at 5:02 AM, Neil Girdhar > wrote: > >> Can I suggest that we instead add the P-square algorithm for the dynamic >> calculation of histog

Re: [Numpy-discussion] Automatic number of bins for numpy histograms

2015-04-14 Thread Chris Barker
On Mon, Apr 13, 2015 at 5:02 AM, Neil Girdhar wrote: > Can I suggest that we instead add the P-square algorithm for the dynamic > calculation of histograms? ( > http://pierrechainais.ec-lille.fr/Centrale/Option_DAD/IMPACT_files/Dynamic%20quantiles%20calcultation%20-%20P2%20Algorythm.pdf > ) > T

Re: [Numpy-discussion] Automatic number of bins for numpy histograms

2015-04-14 Thread Antony Lee
Another improvement would be to make sure, for integer-valued datasets, that all bins cover the same number of integers, since otherwise it is easy to end up with bins "effectively" wider than others: hist(np.random.randint(11, size=1)) shows a peak in the last bin, as it covers both 9 and 10. A
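The effect described above can be reproduced with np.histogram. This sketch uses deterministic data in place of the random sample, and the half-integer edges are only one possible fix, an assumption rather than what the thread settled on.

```python
import numpy as np

# Deterministic stand-in for the random data: each integer 0..10 appears 100 times.
data = np.repeat(np.arange(11), 100)

# Default: 10 equal-width bins over [0, 10]. The last bin is closed on the
# right, so it collects both 9 and 10.
default_counts, default_edges = np.histogram(data)

# One possible fix (assumption): half-integer edges, one bin per integer value.
centered_edges = np.arange(data.min(), data.max() + 2) - 0.5
centered_counts, _ = np.histogram(data, bins=centered_edges)

print(default_counts)
print(centered_counts)
```

With the default edges the last bin holds twice as many samples as the others, even though the data is uniform over the integers. With one bin per integer the counts are flat.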

Re: [Numpy-discussion] Automatic number of bins for numpy histograms

2015-04-14 Thread Neil Girdhar
Can I suggest that we instead add the P-square algorithm for the dynamic calculation of histograms? ( http://pierrechainais.ec-lille.fr/Centrale/Option_DAD/IMPACT_files/Dynamic%20quantiles%20calcultation%20-%20P2%20Algorythm.pdf ) This is already implemented in C++'s boost library ( http://www.bo

Re: [Numpy-discussion] Consider improving numpy.outer's behavior with zero-dimensional vectors

2015-04-14 Thread Neil Girdhar
Yes, I totally agree with you regarding np.sum and np.product, which is why I didn't suggest np.add.reduce, np.multiply.reduce. I wasn't sure whether cumsum and cumprod might be on the line in your judgment. Best, Neil On Tue, Apr 14, 2015 at 3:37 PM, Nathaniel Smith wrote: > On Apr 14, 2015

Re: [Numpy-discussion] Consider improving numpy.outer's behavior with zero-dimensional vectors

2015-04-14 Thread Nathaniel Smith
On Apr 14, 2015 2:48 PM, "Neil Girdhar" wrote: > > Okay, but by the same token, why do we have cumsum? Isn't it identical to > > np.add.accumulate > > — or if you're passing in multidimensional data — > > np.add.accumulate(a.flatten()) > > ? > > add.accumulate feels more generic, would make the o

Re: [Numpy-discussion] Consider improving numpy.outer's behavior with zero-dimensional vectors

2015-04-14 Thread Neil Girdhar
Okay, but by the same token, why do we have cumsum? Isn't it identical to np.add.accumulate — or if you're passing in multidimensional data — np.add.accumulate(a.flatten()) ? add.accumulate feels more generic, would make the other ufunc things more discoverable, and is self-documenting. Simi
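A small check of the equivalence claimed above, assuming a 2-D integer array as the multidimensional case. Note that np.add.accumulate defaults to axis=0, which is why the flatten() call is needed to match np.cumsum's default behaviour.

```python
import numpy as np

a = np.arange(6).reshape(2, 3)

# np.cumsum with the default axis=None operates on the flattened array.
flat_cumsum = np.cumsum(a)
flat_accumulate = np.add.accumulate(a.flatten())

# Without flattening, np.add.accumulate runs along axis 0, like cumsum(axis=0).
axis0_cumsum = np.cumsum(a, axis=0)
axis0_accumulate = np.add.accumulate(a)

print(np.array_equal(flat_cumsum, flat_accumulate))
print(np.array_equal(axis0_cumsum, axis0_accumulate))
```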

Re: [Numpy-discussion] Consider improving numpy.outer's behavior with zero-dimensional vectors

2015-04-14 Thread Neil Girdhar
It also appears that cumsum has a lot of unnecessary overhead over add.accumulate:

In [51]: %timeit np.add.accumulate(a)
The slowest run took 46.31 times longer than the fastest. This could mean that an intermediate result is being cached
100 loops, best of 3: 372 ns per loop

In [52]: %timeit
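Outside IPython, a comparable measurement can be sketched with the standard timeit module. The array used in the original session is not shown, so the small array here is an assumption, and the absolute numbers will differ by machine and NumPy version.

```python
import timeit
import numpy as np

a = np.arange(10)  # assumption: the original array is not shown in the message

n = 10000  # calls per measurement
t_accumulate = timeit.timeit(lambda: np.add.accumulate(a), number=n)
t_cumsum = timeit.timeit(lambda: np.cumsum(a), number=n)

print("np.add.accumulate: %.0f ns per call" % (t_accumulate / n * 1e9))
print("np.cumsum:         %.0f ns per call" % (t_cumsum / n * 1e9))
```

On an array this small the per-call overhead dominates the arithmetic, which is the regime the message is describing.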

[Numpy-discussion] ANN: numexpr 2.4.1 released

2015-04-14 Thread Francesc Alted
Announcing Numexpr 2.4.1. Numexpr is a fast numerical expression evaluator for NumPy. With it, expressions that operate on arrays (like "3*a+4*b") are accelerated and use less memory than doing the same calculation in Python. It wears multi-thr