Re: [Numpy-discussion] Rolling window (moving average, moving std, and more)

2011-01-03 Thread Sebastian Haase
Hi Erik,
This is really neat !  Do I understand correctly, that you mean by
stride tricks, that your rolling_window is _not_ allocating any new
memory ?
IOW, If I have a large array using 500MB of memory, say of float32 of
shape 125,1000,1000 and I want the last axis rolling of window size
11, what would the peak memory usage of that operation be ?
How about renaming the option `window` to `window_size`  (first I was
thinking of things like hamming and hanning windows...)... ?

Thanks,
Sebastian Haase


On Sat, Jan 1, 2011 at 5:29 AM, Erik Rigtorp e...@rigtorp.com wrote:
 Hi,

 Implementing moving average, moving std and other functions working
 over rolling windows using python for loops are slow. This is a
 effective stride trick I learned from Keith Goodman's
 kwgood...@gmail.com Bottleneck code but generalized into arrays of
 any dimension. This trick allows the loop to be performed in C code
 and in the future hopefully using multiple cores.

 import numpy as np

 def rolling_window(a, window):
    
    Make an ndarray with a rolling window of the last dimension

    Parameters
    --
    a : array_like
        Array to add rolling window to
    window : int
        Size of rolling window

    Returns
    ---
    Array that is a view of the original array with a added dimension
    of size w.

    Examples
    
     x=np.arange(10).reshape((2,5))
     rolling_window(x, 3)
    array([[[0, 1, 2], [1, 2, 3], [2, 3, 4]],
           [[5, 6, 7], [6, 7, 8], [7, 8, 9]]])

    Calculate rolling mean of last dimension:
     np.mean(rolling_window(x, 3), -1)
    array([[ 1.,  2.,  3.],
           [ 6.,  7.,  8.]])

    
    if window  1:
        raise ValueError, `window` must be at least 1.
    if window  a.shape[-1]:
        raise ValueError, `window` is too long.
    shape = a.shape[:-1] + (a.shape[-1] - window + 1, window)
    strides = a.strides + (a.strides[-1],)
    return np.lib.stride_tricks.as_strided(a, shape=shape, strides=strides)


 Using np.swapaxes(-1, axis) rolling aggregations over any axis can be 
 computed.

 I submitted a pull request to add this to the stride_tricks module.

 Erik
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Rolling window (moving average, moving std, and more)

2011-01-03 Thread Erik Rigtorp
On Mon, Jan 3, 2011 at 05:13, Sebastian Haase seb.ha...@gmail.com wrote:
 Hi Erik,
 This is really neat !  Do I understand correctly, that you mean by
 stride tricks, that your rolling_window is _not_ allocating any new
 memory ?
Yes, it's only a view.

 IOW, If I have a large array using 500MB of memory, say of float32 of
 shape 125,1000,1000 and I want the last axis rolling of window size
 11, what would the peak memory usage of that operation be ?
It's only a view of the array, no copying is done. Though some
operations like np.std()  will copy the array, but that's more of a
bug. In general It's hard to imagine any speedup gains by copying a
10GB array.

 How about renaming the option `window` to `window_size`  (first I was
 thinking of things like hamming and hanning windows...)... ?
Sounds fare.
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Rolling window (moving average, moving std, and more)

2011-01-03 Thread Keith Goodman
On Fri, Dec 31, 2010 at 8:29 PM, Erik Rigtorp e...@rigtorp.com wrote:

 Implementing moving average, moving std and other functions working
 over rolling windows using python for loops are slow. This is a
 effective stride trick I learned from Keith Goodman's
 kwgood...@gmail.com Bottleneck code but generalized into arrays of
 any dimension. This trick allows the loop to be performed in C code
 and in the future hopefully using multiple cores.

I like using strides for moving window functions. The one downside I
found is that it is slow when window * (arr.shape[axis] - window) is
large:

 a = np.random.rand(100)
 b = rolling_window(a, 5000)

 import bottleneck as bn
 timeit bn.nanmean(b, axis=1)
1 loops, best of 3: 7.1 s per loop
 timeit bn.move_nanmean(a, window=5000, axis=0)
100 loops, best of 3: 7.99 ms per loop
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Rolling window (moving average, moving std, and more)

2011-01-03 Thread Keith Goodman
On Mon, Jan 3, 2011 at 5:37 AM, Erik Rigtorp e...@rigtorp.com wrote:

 It's only a view of the array, no copying is done. Though some
 operations like np.std()  will copy the array, but that's more of a
 bug. In general It's hard to imagine any speedup gains by copying a
 10GB array.

I don't think that np.std makes a copy of the input data if the input
is an array. If the input is, for example, a list, then an array is
created.
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Rolling window (moving average, moving std, and more)

2011-01-03 Thread Keith Goodman
On Mon, Jan 3, 2011 at 7:41 AM, Erik Rigtorp e...@rigtorp.com wrote:
 On Mon, Jan 3, 2011 at 10:36, Keith Goodman kwgood...@gmail.com wrote:
 On Mon, Jan 3, 2011 at 5:37 AM, Erik Rigtorp e...@rigtorp.com wrote:

 It's only a view of the array, no copying is done. Though some
 operations like np.std()  will copy the array, but that's more of a
 bug. In general It's hard to imagine any speedup gains by copying a
 10GB array.

 I don't think that np.std makes a copy of the input data if the input
 is an array. If the input is, for example, a list, then an array is
 created.

 When I tried it on a big array, it tried to allocate a huge amount of
 memory. As I said it's probably a bug.

Yes, that would be a big bug.

np.std does have to initialize the output array. If the window size is
small compared to arr.shape[axis] then the memory taken by the output
array is of the same order as that of the input array. Could that be
what you are seeing?

 a = np.arange(10)

Small window, output array shape (8,):

 rolling_window(a, 2)
array([[0, 1],
   [1, 2],
   [2, 3],
   [3, 4],
   [4, 5],
   [5, 6],
   [6, 7],
   [7, 8],
   [8, 9]])

Big window, output array shape (2,):

 rolling_window(a, 9)
array([[0, 1, 2, 3, 4, 5, 6, 7, 8],
   [1, 2, 3, 4, 5, 6, 7, 8, 9]])
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Rolling window (moving average, moving std, and more)

2011-01-03 Thread Erik Rigtorp
On Mon, Jan 3, 2011 at 10:52, Keith Goodman kwgood...@gmail.com wrote:
 On Mon, Jan 3, 2011 at 7:41 AM, Erik Rigtorp e...@rigtorp.com wrote:
 On Mon, Jan 3, 2011 at 10:36, Keith Goodman kwgood...@gmail.com wrote:
 On Mon, Jan 3, 2011 at 5:37 AM, Erik Rigtorp e...@rigtorp.com wrote:

 It's only a view of the array, no copying is done. Though some
 operations like np.std()  will copy the array, but that's more of a
 bug. In general It's hard to imagine any speedup gains by copying a
 10GB array.

 I don't think that np.std makes a copy of the input data if the input
 is an array. If the input is, for example, a list, then an array is
 created.

 When I tried it on a big array, it tried to allocate a huge amount of
 memory. As I said it's probably a bug.

 Yes, that would be a big bug.

 np.std does have to initialize the output array. If the window size is
 small compared to arr.shape[axis] then the memory taken by the output
 array is of the same order as that of the input array. Could that be
 what you are seeing?

 a = np.arange(10)

 Small window, output array shape (8,):

 rolling_window(a, 2)
 array([[0, 1],
       [1, 2],
       [2, 3],
       [3, 4],
       [4, 5],
       [5, 6],
       [6, 7],
       [7, 8],
       [8, 9]])

 Big window, output array shape (2,):

 rolling_window(a, 9)
 array([[0, 1, 2, 3, 4, 5, 6, 7, 8],
       [1, 2, 3, 4, 5, 6, 7, 8, 9]])

No the array was (500,2000) and i did np.std(rolling_window(a,252),-1)
and it started to allocate  2GB.
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Rolling window (moving average, moving std, and more)

2011-01-03 Thread Eric Firing
On 12/31/2010 06:29 PM, Erik Rigtorp wrote:
 Hi,

 Implementing moving average, moving std and other functions working
 over rolling windows using python for loops are slow. This is a
 effective stride trick I learned from Keith Goodman's
 kwgood...@gmail.com  Bottleneck code but generalized into arrays of
 any dimension. This trick allows the loop to be performed in C code
 and in the future hopefully using multiple cores.


An alternative is to go straight to C, with a cython interface.  If you 
look in the num/src subdirectory of 
http://currents.soest.hawaii.edu/hgstage/hgwebdir.cgi/pycurrents/ you 
will find this approach, labeled ringbuf and runstats.  See 
pycurrents/setup.py, and its driver, pycurrents/runsetup.py, to see how 
runstats is presently being built.

Instead of calculating statistics independently each time the window is 
advanced one data point, the statistics are updated.  I have not done 
any benchmarking, but I expect this approach to be quick.

The code is old; I have not tried to update it to take advantage of 
cython's advances over pyrex.  If I were writing it now, I might not 
bother with the C level at all; it could all be done in cython, probably 
with no speed penalty, and maybe even with reduced overhead.

Eric
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Rolling window (moving average, moving std, and more)

2011-01-03 Thread Erik Rigtorp
On Mon, Jan 3, 2011 at 11:26, Eric Firing efir...@hawaii.edu wrote:
 Instead of calculating statistics independently each time the window is
 advanced one data point, the statistics are updated.  I have not done
 any benchmarking, but I expect this approach to be quick.

This might accumulate numerical errors. But could be fine for many applications.

 The code is old; I have not tried to update it to take advantage of
 cython's advances over pyrex.  If I were writing it now, I might not
 bother with the C level at all; it could all be done in cython, probably
 with no speed penalty, and maybe even with reduced overhead.


No doubt this would be faster, I just wanted to offer a general way to
this in NumPy.
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] numpy installation

2011-01-03 Thread Paul Ivanov
Waqar Rashid, on 2011-01-02 00:38,  wrote:
 Hi,
 
 trying to install numpy on MacOS with python 3.1 
 
 Having installation issues. Has anyone managed to install this on the Mac?
 
 regards
 

Waqar - you sent this to the IPython-User list, but I think you
probably meant to send it to the numpy-discussion list, since
your question does not pertain to IPython itself, so I'm
forwarding your email there.

Also, can you be more specific about what issues you are having?

best,
-- 
Paul Ivanov
314 address only used for lists,  off-list direct email at:
http://pirsquared.org | GPG/PGP key id: 0x0F3E28F7 


signature.asc
Description: Digital signature
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


[Numpy-discussion] newbie question (curve fitting Z=f(X,Y))

2011-01-03 Thread Sachin Kumar Sharma
Hi,

Absolute basic question I want to do following


*Read data from excel sheet (three columns - X,Y,Z)

*Curve fit a function Z=F(X,Y)

*Plot X Vs Z (from data) and plot X Vs Z (from curve fit)


Kindly advise me how to write a basic python script for the same.

Thanks  Regards

Sachin


Sachin Kumar Sharma
Senior Geomodeler



___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion