On Fri, Nov 19, 2010 at 12:10 PM, <[email protected]> wrote:

> What's the speed advantage of nanny compared to np.nansum that you
> have if the arrays are larger, say (1000,10) or (10000,100) axis=0 ?
Good point. In the small examples I showed so far maybe the speedup was
all in overhead. Fortunately, that's not the case:

>> arr = np.random.rand(1000, 1000)
>> timeit np.nansum(arr)
100 loops, best of 3: 4.79 ms per loop
>> timeit ny.nansum(arr)
1000 loops, best of 3: 1.53 ms per loop

>> arr[arr > 0.5] = np.nan
>> timeit np.nansum(arr)
10 loops, best of 3: 44.5 ms per loop
>> timeit ny.nansum(arr)
100 loops, best of 3: 6.18 ms per loop

>> timeit np.nansum(arr, axis=0)
10 loops, best of 3: 52.3 ms per loop
>> timeit ny.nansum(arr, axis=0)
100 loops, best of 3: 12.2 ms per loop

np.nansum makes a copy of the input array and makes a mask (another
copy), then uses the mask to set the NaNs to zero in the copy. So not
only is nanny faster, it also uses less memory.
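
For anyone curious where the difference comes from, here is a rough
sketch of the two strategies in pure Python. The function names are
mine, and the second one is only a stand-in: nanny does the single
pass in compiled code, which is where its speed comes from.

import numpy as np

def nansum_copy_and_mask(arr, axis=None):
    # The np.nansum strategy described above: two temporaries the
    # size of the input, then an ordinary sum over the cleaned copy.
    y = arr.copy()           # first temporary: a copy of the input
    mask = np.isnan(y)       # second temporary: the boolean NaN mask
    y[mask] = 0.0            # zero out the NaNs in the copy
    return y.sum(axis=axis)

def nansum_single_pass(arr):
    # One pass, no temporaries. A compiled version of a loop like
    # this is what nanny can do; in pure Python it would be slow.
    total = 0.0
    for x in arr.flat:
        if x == x:           # NaN is the only float not equal to itself
            total += x
    return total

a = np.array([[1.0, np.nan], [2.0, 3.0]])
print(nansum_copy_and_mask(a))  # 6.0
print(nansum_single_pass(a))    # 6.0

The copy-and-mask version allocates two full-size temporaries; the
single-pass version allocates nothing beyond a scalar accumulator,
which accounts for the memory difference mentioned above.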
