Re: [Numpy-discussion] Rolling window (moving average, moving std, and more)

2011-01-03 Thread Erik Rigtorp
On Mon, Jan 3, 2011 at 05:13, Sebastian Haase seb.ha...@gmail.com wrote:
 Hi Erik,
 This is really neat!  Do I understand correctly that by stride
 tricks you mean your rolling_window is _not_ allocating any new
 memory?
Yes, it's only a view.

 IOW, if I have a large array using 500 MB of memory, say float32 of
 shape (125, 1000, 1000), and I want the last axis rolling with window
 size 11, what would the peak memory usage of that operation be?
It's only a view of the array; no copying is done. Some operations,
like np.std(), will copy the array, but that's more of a bug. In
general, it's hard to imagine any speedup from copying a 10 GB array.
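A quick sketch (mine, not from the thread) of the no-copy claim, using as_strided directly to build a length-2 rolling window and checking that it shares memory with the input:

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

a = np.arange(10)
# Build a (9, 2) rolling-window view: same buffer, extra trailing axis.
w = as_strided(a, shape=(a.shape[-1] - 2 + 1, 2),
               strides=a.strides + (a.strides[-1],))
print(w.base is not None)  # True: the result is a view, not a copy
a[0] = 99
print(w[0, 0])             # 99: writes to `a` show through the window
```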

 How about renaming the option `window` to `window_size`? (At first I
 was thinking of things like Hamming and Hanning windows...)
Sounds fair.
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion


Re: [Numpy-discussion] Rolling window (moving average, moving std, and more)

2011-01-03 Thread Erik Rigtorp
On Mon, Jan 3, 2011 at 10:52, Keith Goodman kwgood...@gmail.com wrote:
 On Mon, Jan 3, 2011 at 7:41 AM, Erik Rigtorp e...@rigtorp.com wrote:
 On Mon, Jan 3, 2011 at 10:36, Keith Goodman kwgood...@gmail.com wrote:
 On Mon, Jan 3, 2011 at 5:37 AM, Erik Rigtorp e...@rigtorp.com wrote:

 It's only a view of the array; no copying is done. Some operations,
 like np.std(), will copy the array, but that's more of a bug. In
 general, it's hard to imagine any speedup from copying a 10 GB array.

 I don't think that np.std makes a copy of the input data if the input
 is an array. If the input is, for example, a list, then an array is
 created.

 When I tried it on a big array, it tried to allocate a huge amount of
 memory. As I said it's probably a bug.

 Yes, that would be a big bug.

 np.std does have to initialize the output array. If the window size is
 small compared to arr.shape[axis] then the memory taken by the output
 array is of the same order as that of the input array. Could that be
 what you are seeing?

 >>> a = np.arange(10)

 Small window, output array shape (9,):

 >>> rolling_window(a, 2)
 array([[0, 1],
        [1, 2],
        [2, 3],
        [3, 4],
        [4, 5],
        [5, 6],
        [6, 7],
        [7, 8],
        [8, 9]])

 Big window, output array shape (2,):

 >>> rolling_window(a, 9)
 array([[0, 1, 2, 3, 4, 5, 6, 7, 8],
        [1, 2, 3, 4, 5, 6, 7, 8, 9]])

No, the array was (500, 2000) and I did np.std(rolling_window(a, 252), -1)
and it started to allocate 2 GB.
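A rough back-of-the-envelope check (my arithmetic, not from the thread): the figure is consistent with np.std materializing the (500, 1749, 252) windowed view as a float64 temporary:

```python
# Temporary created if np.std copies the rolling-window view of a
# (500, 2000) float64 array with window size 252:
windows = 2000 - 252 + 1             # 1749 window positions per row
nbytes = 500 * windows * 252 * 8     # float64 elements, 8 bytes each
print(nbytes / 2.0 ** 30)            # roughly 1.6 GiB
```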


Re: [Numpy-discussion] Rolling window (moving average, moving std, and more)

2011-01-03 Thread Erik Rigtorp
On Mon, Jan 3, 2011 at 11:26, Eric Firing efir...@hawaii.edu wrote:
 Instead of calculating statistics independently each time the window is
 advanced one data point, the statistics are updated.  I have not done
 any benchmarking, but I expect this approach to be quick.

This might accumulate numerical errors, but could be fine for many applications.
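To illustrate the concern (a contrived sketch, not code from the thread): an update-based moving average in float32 drifts away from per-window recomputation as round-off accumulates in the running sum:

```python
import numpy as np

def running_mean(a, window):
    # Update-based moving average: add the newest point, subtract the
    # oldest. Each update rounds the running sum, and that round-off
    # accumulates over the length of the series.
    out = np.empty(len(a) - window + 1)
    s = a[:window].sum()
    out[0] = s / window
    for i in range(1, len(out)):
        s += a[i + window - 1] - a[i - 1]
        out[i] = s / window
    return out

rng = np.random.default_rng(0)
a = rng.normal(1e6, 1.0, 100_000).astype(np.float32)
exact = np.array([a[i:i + 10].mean() for i in range(len(a) - 9)])
fast = running_mean(a, 10)
drift = np.abs(fast - exact).max()
print(drift)  # far larger than a single float32 rounding step
```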

 The code is old; I have not tried to update it to take advantage of
 cython's advances over pyrex.  If I were writing it now, I might not
 bother with the C level at all; it could all be done in cython, probably
 with no speed penalty, and maybe even with reduced overhead.


No doubt this would be faster; I just wanted to offer a general way to
do this in NumPy.


[Numpy-discussion] Faster NaN functions

2010-12-31 Thread Erik Rigtorp
Hi,

I just sent a pull request for some faster NaN functions,
https://github.com/rigtorp/numpy.

I implemented the following generalized ufuncs: nansum(), nancumsum(),
nanmean(), nanstd(), and for fun mean() and std(). It turns out that
the generalized ufunc mean() and std() are faster than the current
NumPy functions. I'm also going to add nanprod(), nancumprod(),
nanmax(), nanmin(), nanargmax(), and nanargmin().

The current implementation is not optimized in any way and there are
probably some speedups possible.

I hope we can get this into NumPy 2.0; people around me and I seem
to have a need for these functions.
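For reference, the NaN-ignoring semantics these functions target can be sketched in plain NumPy (this is just the reference behavior, not the generalized-ufunc implementation from the pull request):

```python
import numpy as np

def nanmean_ref(a, axis=-1):
    # Average only the non-NaN entries along `axis`.
    mask = ~np.isnan(a)
    return np.where(mask, a, 0.0).sum(axis) / mask.sum(axis)

a = np.array([[1.0, np.nan, 3.0],
              [4.0, 5.0, 6.0]])
print(nanmean_ref(a))  # [2. 5.]
```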

Erik


Re: [Numpy-discussion] Simple shared arrays

2010-12-31 Thread Erik Rigtorp
On Fri, Dec 31, 2010 at 02:13, Paul Ivanov pivanov...@gmail.com wrote:
 Erik Rigtorp, on 2010-12-30 21:30,  wrote:
 Hi,

 I was trying to parallelize some algorithms and needed a writable
 array shared between processes. It turned out to be quite simple and
 gave a nice speed up almost linear in number of cores. Of course you
 need to know what you are doing to avoid segfaults and such. But I
 still think something like this should be included with NumPy for
 power users.

 This works by inheriting anonymous mmaped memory. Not sure if this
 works on windows.
 --snip--

 I've successfully used (what I think is) Sturla Molden's
 shmem_as_ndarray as outlined here [1] and here [2] for these
 purposes.


Yeah, I saw that code too. My implementation is even more lax, but
easier to use. It sends arrays by memory reference to subprocesses.
Dangerous: yes; effective: very.

It would be nice if we could stamp out some good, effective patterns
using multiprocessing and include them with NumPy. The best solution
is probably a parallel_for function:

def parallel_for(func, inherit_args, iterable): ...

where func should be

def func(inherit_args, item): ...

and parallel_for makes sure inherit_args are viewable as instances of
the shared() class with writable shared memory.
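A minimal sketch of what such a parallel_for could look like (hypothetical: it uses plain pickling rather than the shared-memory inheritance above, and the 'fork' start method, so it is POSIX-only, like the mmap trick):

```python
import multiprocessing as mp
from functools import partial

def parallel_for(func, inherit_args, iterable, processes=None):
    # Fan func out over iterable, passing inherit_args to every call.
    # The real version would ensure inherit_args are `shared` arrays;
    # here they are simply pickled and copied into each worker.
    ctx = mp.get_context('fork')  # fork, matching the mmap-inheritance idea
    with ctx.Pool(processes) as pool:
        return pool.map(partial(func, inherit_args), iterable)

def func(inherit_args, item):
    base = inherit_args[0]
    return base + item

if __name__ == '__main__':
    print(parallel_for(func, (100,), range(5)))  # [100, 101, 102, 103, 104]
```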

Erik


[Numpy-discussion] Rolling window (moving average, moving std, and more)

2010-12-31 Thread Erik Rigtorp
Hi,

Implementing moving average, moving std, and other functions working
over rolling windows using Python for loops is slow. This is an
effective stride trick I learned from Keith Goodman's
(kwgood...@gmail.com) Bottleneck code, but generalized to arrays of
any dimension. This trick allows the loop to be performed in C code
and, in the future, hopefully using multiple cores.

import numpy as np

def rolling_window(a, window):
    """
    Make an ndarray with a rolling window over the last dimension.

    Parameters
    ----------
    a : array_like
        Array to add rolling window to
    window : int
        Size of rolling window

    Returns
    -------
    Array that is a view of the original array with an added dimension
    of size `window`.

    Examples
    --------
    >>> x = np.arange(10).reshape((2, 5))
    >>> rolling_window(x, 3)
    array([[[0, 1, 2], [1, 2, 3], [2, 3, 4]],
           [[5, 6, 7], [6, 7, 8], [7, 8, 9]]])

    Calculate rolling mean of last dimension:

    >>> np.mean(rolling_window(x, 3), -1)
    array([[ 1.,  2.,  3.],
           [ 6.,  7.,  8.]])
    """
    if window < 1:
        raise ValueError, "`window` must be at least 1."
    if window > a.shape[-1]:
        raise ValueError, "`window` is too long."
    shape = a.shape[:-1] + (a.shape[-1] - window + 1, window)
    strides = a.strides + (a.strides[-1],)
    return np.lib.stride_tricks.as_strided(a, shape=shape, strides=strides)


Using np.swapaxes(a, -1, axis), rolling aggregations over any axis can be computed.
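For example, a rolling mean of size 2 down axis 0 (rolling_window is reproduced inline, without the input checks, so the snippet stands alone):

```python
import numpy as np
from numpy.lib.stride_tricks import as_strided

def rolling_window(a, window):
    # Same stride trick as above: append a window-sized trailing axis.
    shape = a.shape[:-1] + (a.shape[-1] - window + 1, window)
    strides = a.strides + (a.strides[-1],)
    return as_strided(a, shape=shape, strides=strides)

x = np.arange(12.0).reshape((4, 3))
# Swap the target axis to the end, window it, aggregate, swap back.
r = np.mean(rolling_window(np.swapaxes(x, -1, 0), 2), -1)
out = np.swapaxes(r, -1, 0)
print(out)  # pairwise means down each column of x
```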

I submitted a pull request to add this to the stride_tricks module.

Erik


[Numpy-discussion] Simple shared arrays

2010-12-30 Thread Erik Rigtorp
Hi,

I was trying to parallelize some algorithms and needed a writable
array shared between processes. It turned out to be quite simple and
gave a nice speed up almost linear in number of cores. Of course you
need to know what you are doing to avoid segfaults and such. But I
still think something like this should be included with NumPy for
power users.

This works by inheriting anonymous mmaped memory. Not sure if this
works on windows.

import numpy as np
import multiprocessing as mp

class shared(np.ndarray):
    """Shared writable array"""

    def __new__(subtype, shape, interface=None):
        size = int(np.prod(shape))
        if interface is None:
            buffer = mp.RawArray('d', size)
            self = np.ndarray.__new__(subtype, shape, float, buffer)
        else:
            class Dummy(object): pass
            buffer = Dummy()
            buffer.__array_interface__ = interface
            a = np.asarray(buffer)
            self = np.ndarray.__new__(subtype, shape=a.shape, buffer=a)
        return self

    def __reduce_ex__(self, protocol):
        return shared, (self.shape, self.__array_interface__)

    def __reduce__(self):
        return self.__reduce_ex__(0)

Also see attached file for example usage.

Erik
#!/usr/bin/env python

import numpy as np
import multiprocessing as mp

class shared(np.ndarray):
    """Shared writable array"""

    def __new__(subtype, shape, interface=None):
        size = int(np.prod(shape))
        if interface is None:
            buffer = mp.RawArray('d', size)
            self = np.ndarray.__new__(subtype, shape, float, buffer)
        else:
            class Dummy(object): pass
            buffer = Dummy()
            buffer.__array_interface__ = interface
            a = np.asarray(buffer)
            self = np.ndarray.__new__(subtype, shape=a.shape, buffer=a)
        return self

    def __reduce_ex__(self, protocol):
        return shared, (self.shape, self.__array_interface__)

    def __reduce__(self):
        return self.__reduce_ex__(0)


def poolmap(func, iterable, chunksize=None):
    p = mp.Pool()
    return p.map(func, iterable, chunksize)

def std((a, o, i)):
    o[i] = np.std(a[i])

def parallel_std(a):
    o = shared(a.shape[0])
    poolmap(std, [(a, o, i) for i in range(a.shape[0])])
    return o

def main():
    a = shared((1000, 1))
    a[:, :] = np.random.rand(1000, 1)

    import timeit
    print timeit.timeit(lambda: np.std(a, 1), number=100) / 100
    print timeit.timeit(lambda: parallel_std(a), number=100) / 100

if __name__ == '__main__':
    main()