On 11/15/2011 06:02 PM, Warren Weckesser wrote:
On Tue, Nov 15, 2011 at 10:48 AM, Andreas Müller
<amuel...@ais.uni-bonn.de> wrote:
On 11/15/2011 05:46 PM, Andreas Müller wrote:
On 11/15/2011 04:28 PM, Bruce Southey wrote:
On 11/14/2011 10:05 AM, Andreas Müller wrote:
On 11/14/2011 04:23 PM, David Cournapeau wrote:
On Mon, Nov 14, 2011 at 12:46 PM, Andreas Müller
<amuel...@ais.uni-bonn.de> wrote:
Hi everybody.
When I did some normalization using numpy, I noticed that numpy.std uses
more RAM than I was expecting.
A quick Google search gave me this:
http://luispedro.org/software/ncreduce
The site claims that std and other reduce operations are implemented
naively with many temporaries.
Is that true? And if so, is there a particular reason for that?
This issue seems quite easy to fix.
In particular, the link I gave above provides code.
The code provided implements only a few special cases: being more
efficient in just those cases is indeed easy.
I am particularly interested in the std function.
Is it implemented as a separate function, or as an instance
of a general reduce operation?
_______________________________________________
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
The 'On-line algorithm'
(http://en.wikipedia.org/wiki/Algorithms_for_calculating_variance#On-line_algorithm)
could save you storage. I would presume that if you know Cython,
you can probably make it quick as well (to address the loop over
the data).
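For reference, the on-line algorithm mentioned above can be sketched in
plain Python roughly as follows. This is only an illustration of the
single-pass update, not numpy code; the name online_std and its ddof
argument are made up for this example. It keeps three scalars, so its
memory use does not depend on the number of samples, but the Python loop
is slow, which is why Cython comes up.

```python
def online_std(values, ddof=0):
    """Single-pass standard deviation over any iterable of numbers.

    Hypothetical helper for illustration; not part of numpy.
    """
    count = 0
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the current mean
    for x in values:
        count += 1
        delta = x - mean          # deviation from the old mean
        mean += delta / count     # update the running mean
        m2 += delta * (x - mean)  # uses both the old and the new mean
    if count - ddof <= 0:
        return float("nan")
    return (m2 / (count - ddof)) ** 0.5


print(online_std([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]))
```

Because it only needs an iterable, the same function also works on data
that is streamed from disk and never held in memory at once.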
My question was more along the lines of "why doesn't numpy do the
online algorithm".
To be more precise: even without the online version, computing
E(X^2) and E(X)^2 would be good.
It seems numpy centers the whole dataset. Otherwise I can't
explain why the memory needed should depend on the number of
examples.
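A minimal sketch of the E(X^2) - E(X)^2 idea for a 1-D array, assuming
float64 input (the name std_two_moments is made up here). It uses np.dot
for the sum of squares so that no array the size of x is allocated. One
caveat: when the mean is large compared to the spread, the subtraction
can lose precision through cancellation, which is a known weakness of
this formula compared with centering the data first.

```python
import numpy as np


def std_two_moments(x):
    """Std of a 1-D array via E[X^2] - E[X]^2 (hypothetical helper).

    No array-sized temporary is created, provided x is already float64;
    otherwise np.asarray makes one converted copy.
    """
    x = np.asarray(x, dtype=np.float64)
    n = x.size
    mean = x.sum() / n
    mean_sq = np.dot(x, x) / n             # sum of squares without building x**2
    var = max(mean_sq - mean * mean, 0.0)  # clamp tiny negatives from rounding
    return np.sqrt(var)


x = np.random.RandomState(0).rand(10000)
print(std_two_moments(x), np.std(x))
```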
Yes, that is what it is doing. See line 63 in the function _var(),
which is called by _std():
https://github.com/numpy/numpy/blob/master/numpy/core/_methods.py
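To make the memory behaviour concrete, here is a rough sketch of a
centering-based computation next to a chunked variant. It is not a copy
of _var(); both function names and the chunk parameter are invented for
this example. The first version builds temporaries as large as the input,
which would explain memory use growing with the number of examples. The
second keeps the two-pass centering but only ever centers one chunk at a
time.

```python
import numpy as np


def centered_std(x):
    """Two-pass std that centers the whole array at once (illustrative)."""
    x = np.asarray(x, dtype=np.float64)
    centered = x - x.mean()                       # temporary as large as x
    return np.sqrt(np.mean(centered * centered))  # second full-size temporary


def chunked_centered_std(x, chunk=65536):
    """Same two-pass idea, centering at most `chunk` elements at a time."""
    x = np.asarray(x, dtype=np.float64).ravel()   # may copy if not contiguous
    mean = x.mean()
    total = 0.0
    for start in range(0, x.size, chunk):
        d = x[start:start + chunk] - mean         # small temporary only
        total += np.dot(d, d)                     # accumulate squared deviations
    return np.sqrt(total / x.size)


x = np.random.RandomState(1).rand(200000) + 1000.0
print(centered_std(x), chunked_centered_std(x), np.std(x))
```

The chunked variant trades one extra Python-level loop for a bounded
amount of scratch memory, while keeping the numerical benefit of
subtracting the mean before squaring.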
Thanks for the clarification. I thought the function was somewhere in
the C code; I don't know why.
I'll see if I can reformulate the function.