Re: [Numpy-discussion] Rolling window (moving average, moving std, and more)
On Mon, Jan 3, 2011 at 05:13, Sebastian Haase seb.ha...@gmail.com wrote:
> Hi Erik, This is really neat! Do I understand correctly that by stride
> tricks you mean that your rolling_window is _not_ allocating any new
> memory?

Yes, it's only a view.

> IOW, if I have a large array using 500MB of memory, say float32 of shape
> (125, 1000, 1000), and I want the last axis rolling with window size 11,
> what would the peak memory usage of that operation be?

It's only a view of the array; no copying is done. Some operations like
np.std() will copy the array, but that's more of a bug. In general it's
hard to imagine any speedup gained by copying a 10GB array.

> How about renaming the option `window` to `window_size`? (At first I was
> thinking of things like hamming and hanning windows...)

Sounds fair.
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
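[Editor's note: a minimal, self-contained check of the "no new memory" claim discussed above, reproducing the rolling_window function posted later in this thread. np.shares_memory is used to confirm the result is a view over the original buffer.]

```python
import numpy as np

# rolling_window as posted later in this thread: a strided view over the
# last axis; no new data buffer is allocated.
def rolling_window(a, window):
    shape = a.shape[:-1] + (a.shape[-1] - window + 1, window)
    strides = a.strides + (a.strides[-1],)
    return np.lib.stride_tricks.as_strided(a, shape=shape, strides=strides)

a = np.zeros((4, 1000), dtype=np.float32)
w = rolling_window(a, 11)

# The windowed array shares the original buffer: no copy, regardless of size.
print(np.shares_memory(a, w))  # True
print(w.shape)                 # (4, 990, 11)
```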
Re: [Numpy-discussion] Rolling window (moving average, moving std, and more)
On Mon, Jan 3, 2011 at 10:52, Keith Goodman kwgood...@gmail.com wrote:
> On Mon, Jan 3, 2011 at 7:41 AM, Erik Rigtorp e...@rigtorp.com wrote:
>> On Mon, Jan 3, 2011 at 10:36, Keith Goodman kwgood...@gmail.com wrote:
>>> On Mon, Jan 3, 2011 at 5:37 AM, Erik Rigtorp e...@rigtorp.com wrote:
>>>> It's only a view of the array; no copying is done. Some operations
>>>> like np.std() will copy the array, but that's more of a bug. In
>>>> general it's hard to imagine any speedup gained by copying a 10GB
>>>> array.
>>> I don't think that np.std makes a copy of the input data if the input
>>> is an array. If the input is, for example, a list, then an array is
>>> created.
>> When I tried it on a big array, it tried to allocate a huge amount of
>> memory. As I said, it's probably a bug.
> Yes, that would be a big bug. np.std does have to initialize the output
> array. If the window size is small compared to arr.shape[axis], then the
> memory taken by the output array is of the same order as that of the
> input array. Could that be what you are seeing?
>
> >>> a = np.arange(10)
>
> Small window, output array shape (9,):
>
> >>> rolling_window(a, 2)
> array([[0, 1],
>        [1, 2],
>        [2, 3],
>        [3, 4],
>        [4, 5],
>        [5, 6],
>        [6, 7],
>        [7, 8],
>        [8, 9]])
>
> Big window, output array shape (2,):
>
> >>> rolling_window(a, 9)
> array([[0, 1, 2, 3, 4, 5, 6, 7, 8],
>        [1, 2, 3, 4, 5, 6, 7, 8, 9]])

No, the array was (500, 2000) and I did np.std(rolling_window(a, 252), -1)
and it started to allocate 2GB.
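[Editor's note: a back-of-the-envelope check of the 2GB report above. One plausible explanation is that the reduction materializes the non-contiguous windowed view as a contiguous temporary; the numbers below show that such a temporary would be about the size Erik observed.]

```python
import numpy as np

# Windowed view of a (500, 2000) float64 array with window 252 has shape
# (500, 1749, 252). If np.std materializes that view as a contiguous copy,
# the temporary alone is ~1.76 GB, matching the ~2 GB allocation seen.
shape = (500, 2000 - 252 + 1, 252)
nbytes = np.prod(shape) * 8  # 8 bytes per float64 element
print(nbytes / 1e9)  # ~1.76 GB
```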
Re: [Numpy-discussion] Rolling window (moving average, moving std, and more)
On Mon, Jan 3, 2011 at 11:26, Eric Firing efir...@hawaii.edu wrote:
> Instead of calculating statistics independently each time the window is
> advanced one data point, the statistics are updated. I have not done any
> benchmarking, but I expect this approach to be quick.

This might accumulate numerical errors, but could be fine for many
applications.

> The code is old; I have not tried to update it to take advantage of
> cython's advances over pyrex. If I were writing it now, I might not
> bother with the C level at all; it could all be done in cython, probably
> with no speed penalty, and maybe even with reduced overhead.

No doubt this would be faster; I just wanted to offer a general way to do
this in NumPy.
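[Editor's note: a minimal sketch of the "update as the window advances" idea discussed above — a moving mean in O(n) via a running cumulative sum, rather than recomputing each window. As Erik notes, this style of updating can accumulate floating-point error relative to recomputing each window from scratch.]

```python
import numpy as np

def moving_mean(a, window):
    # Prefix sums: c[i] = sum of a[:i]; each window's sum is a difference
    # of two prefix sums, so the whole result costs O(n) regardless of
    # window size.
    c = np.cumsum(np.concatenate(([0.0], a)))
    return (c[window:] - c[:-window]) / window

x = np.arange(10, dtype=float)
print(moving_mean(x, 3))  # [1. 2. 3. 4. 5. 6. 7. 8.]
```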
[Numpy-discussion] Faster NaN functions
Hi, I just sent a pull request for some faster NaN functions,
https://github.com/rigtorp/numpy.

I implemented the following generalized ufuncs: nansum(), nancumsum(),
nanmean(), nanstd(), and for fun mean() and std(). It turns out that the
generalized ufunc mean() and std() are faster than the current numpy
functions. I'm also going to add nanprod(), nancumprod(), nanmax(),
nanmin(), nanargmax(), nanargmin().

The current implementation is not optimized in any way and there are
probably some speedups possible. I hope we can get this into numpy 2.0;
people around me and I seem to have a need for these functions.

Erik
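[Editor's note: a sketch of the intended NaN-skipping semantics, implemented with plain masking rather than Erik's generalized ufuncs. NumPy later gained a built-in np.nanmean (in 1.8); the version below is only an illustration of the behavior.]

```python
import numpy as np

def nanmean(a, axis=None):
    # Reduce while skipping NaNs: zero them out for the sum, and divide
    # by the count of non-NaN elements instead of the full length.
    mask = np.isnan(a)
    total = np.where(mask, 0.0, a).sum(axis=axis)
    count = (~mask).sum(axis=axis)
    return total / count

a = np.array([[1.0, np.nan, 3.0],
              [4.0, 5.0, np.nan]])
print(nanmean(a, axis=1))  # [2.  4.5]
```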
Re: [Numpy-discussion] Simple shared arrays
On Fri, Dec 31, 2010 at 02:13, Paul Ivanov pivanov...@gmail.com wrote:
> Erik Rigtorp, on 2010-12-30 21:30, wrote:
>> Hi, I was trying to parallelize some algorithms and needed a writable
>> array shared between processes. It turned out to be quite simple and
>> gave a nice speedup, almost linear in the number of cores. Of course
>> you need to know what you are doing to avoid segfaults and such. But I
>> still think something like this should be included with NumPy for power
>> users. This works by inheriting anonymous mmapped memory. Not sure if
>> this works on Windows.
> --snip--
> I've successfully used (what I think is) Sturla Molden's
> shmem_as_ndarray, as outlined here [1] and here [2], for these purposes.

Yeah, I saw that code too. My implementation is even more lax, but easier
to use. It sends arrays by memory reference to subprocesses. Dangerous:
yes; effective: very.

It would be nice if we could stamp out some good, effective patterns using
multiprocessing and include them with numpy. The best solution is probably
a parallel_for function:

def parallel_for(func, inherit_args, iterable):
    ...

where func should be

def func(inherit_args, item):
    ...

and parallel_for makes sure inherit_args are viewable as a class shared()
with writable shared memory.

Erik
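[Editor's note: a minimal sketch of the parallel_for pattern Erik outlines above, built on multiprocessing.Pool. It assumes inherit_args is backed by shared memory (here an mp.RawArray) so that writes in workers are visible to the parent; the helper names (_init, _worker) are ours, not from the original post.]

```python
import multiprocessing as mp

_state = None

def _init(func, inherit_args):
    # Runs once in each worker; stashes the function and shared arguments.
    global _state
    _state = (func, inherit_args)

def _worker(item):
    func, inherit_args = _state
    func(inherit_args, item)

def parallel_for(func, inherit_args, iterable, processes=None):
    # Fan items out to a pool; func(inherit_args, item) runs per item.
    with mp.Pool(processes, initializer=_init,
                 initargs=(func, inherit_args)) as p:
        p.map(_worker, iterable)

def fill(out, i):
    # Writes land in shared memory, so the parent sees them.
    out[i] = i * i

if __name__ == "__main__":
    out = mp.RawArray('d', 8)
    parallel_for(fill, out, range(8))
    print(list(out))  # [0.0, 1.0, 4.0, ..., 49.0]
```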
[Numpy-discussion] Rolling window (moving average, moving std, and more)
Hi, Implementing moving average, moving std and other functions working
over rolling windows using python for loops is slow. This is an effective
stride trick I learned from Keith Goodman's kwgood...@gmail.com Bottleneck
code, but generalized to arrays of any dimension. This trick allows the
loop to be performed in C code and, in the future, hopefully using
multiple cores.

import numpy as np

def rolling_window(a, window):
    """Make an ndarray with a rolling window of the last dimension

    Parameters
    ----------
    a : array_like
        Array to add rolling window to
    window : int
        Size of rolling window

    Returns
    -------
    Array that is a view of the original array with an added dimension
    of size `window`.

    Examples
    --------
    >>> x = np.arange(10).reshape((2, 5))
    >>> rolling_window(x, 3)
    array([[[0, 1, 2], [1, 2, 3], [2, 3, 4]],
           [[5, 6, 7], [6, 7, 8], [7, 8, 9]]])

    Calculate rolling mean of last dimension:

    >>> np.mean(rolling_window(x, 3), -1)
    array([[ 1.,  2.,  3.],
           [ 6.,  7.,  8.]])
    """
    if window < 1:
        raise ValueError("`window` must be at least 1.")
    if window > a.shape[-1]:
        raise ValueError("`window` is too long.")
    shape = a.shape[:-1] + (a.shape[-1] - window + 1, window)
    strides = a.strides + (a.strides[-1],)
    return np.lib.stride_tricks.as_strided(a, shape=shape, strides=strides)

Using a.swapaxes(-1, axis), rolling aggregations over any axis can be
computed. I submitted a pull request to add this to the stride_tricks
module.

Erik
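[Editor's note: a self-contained example of the swapaxes trick mentioned at the end of the post — rolling over an axis other than the last by moving it to the end, windowing, reducing, and swapping back. rolling_window is reproduced from the post above.]

```python
import numpy as np

def rolling_window(a, window):
    shape = a.shape[:-1] + (a.shape[-1] - window + 1, window)
    strides = a.strides + (a.strides[-1],)
    return np.lib.stride_tricks.as_strided(a, shape=shape, strides=strides)

x = np.arange(10).reshape((5, 2))

# Rolling mean of size 3 down axis 0: swap axis 0 to the end, window it,
# reduce over the new window axis, then swap back.
m = np.mean(rolling_window(x.swapaxes(-1, 0), 3), -1).swapaxes(-1, 0)
print(m)
# [[2. 3.]
#  [4. 5.]
#  [6. 7.]]
```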
[Numpy-discussion] Simple shared arrays
Hi, I was trying to parallelize some algorithms and needed a writable
array shared between processes. It turned out to be quite simple and gave
a nice speedup, almost linear in the number of cores. Of course you need
to know what you are doing to avoid segfaults and such. But I still think
something like this should be included with NumPy for power users. This
works by inheriting anonymous mmapped memory. Not sure if this works on
Windows.

import numpy as np
import multiprocessing as mp

class shared(np.ndarray):
    """Shared writable array"""
    def __new__(subtype, shape, interface=None):
        size = np.prod(shape)
        if interface is None:
            buffer = mp.RawArray('d', int(size))
            self = np.ndarray.__new__(subtype, shape, float, buffer)
        else:
            class Dummy(object):
                pass
            buffer = Dummy()
            buffer.__array_interface__ = interface
            a = np.asarray(buffer)
            self = np.ndarray.__new__(subtype, shape=a.shape, buffer=a)
        return self

    def __reduce_ex__(self, protocol):
        return shared, (self.shape, self.__array_interface__)

    def __reduce__(self):
        return self.__reduce_ex__(0)

Also see attached file for example usage.
Erik

#!/usr/bin/env python
import numpy as np
import multiprocessing as mp

class shared(np.ndarray):
    """Shared writable array"""
    def __new__(subtype, shape, interface=None):
        size = np.prod(shape)
        if interface is None:
            buffer = mp.RawArray('d', int(size))
            self = np.ndarray.__new__(subtype, shape, float, buffer)
        else:
            class Dummy(object):
                pass
            buffer = Dummy()
            buffer.__array_interface__ = interface
            a = np.asarray(buffer)
            self = np.ndarray.__new__(subtype, shape=a.shape, buffer=a)
        return self

    def __reduce_ex__(self, protocol):
        return shared, (self.shape, self.__array_interface__)

    def __reduce__(self):
        return self.__reduce_ex__(0)

def poolmap(func, iterable, chunksize=None):
    p = mp.Pool()
    return p.map(func, iterable, chunksize)

def std((a, o, i)):
    o[i] = np.std(a[i])

def parallel_std(a):
    o = shared(a.shape[0])
    poolmap(std, [(a, o, i) for i in range(a.shape[0])])
    return o

def main():
    a = shared((1000, 1))
    a[:, :] = np.random.rand(1000, 1)
    import timeit
    print timeit.timeit(lambda: np.std(a, 1), number=100) / 100
    print timeit.timeit(lambda: parallel_std(a), number=100) / 100

if __name__ == "__main__":
    main()
___
NumPy-Discussion mailing list
NumPy-Discussion@scipy.org
http://mail.scipy.org/mailman/listinfo/numpy-discussion
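[Editor's note: a minimal in-process demonstration of the mechanism the shared class relies on — reconstructing an ndarray from another array's __array_interface__ yields a view on the same buffer, not a copy. Across processes, the raw pointer in the interface is only valid because fork inherits the anonymous mapping, which is why the original post warns this may not work on Windows.]

```python
import numpy as np
import multiprocessing as mp

# A writable double buffer in anonymous shared memory, viewed as an array.
buf = mp.RawArray('d', 6)
a = np.frombuffer(buf).reshape(2, 3)

# Rebuild an array from the interface dict alone, as shared() does when
# unpickling: Dummy exposes __array_interface__, so np.asarray wraps the
# same memory rather than copying it.
class Dummy(object):
    pass

d = Dummy()
d.__array_interface__ = a.__array_interface__
b = np.asarray(d)

b[0, 0] = 42.0
print(a[0, 0])  # 42.0 -- writes through b are visible through a
```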