On Fri, Dec 31, 2010 at 8:21 AM, Lev Givon <[email protected]> wrote:
> Received from Erik Rigtorp on Fri, Dec 31, 2010 at 08:52:53AM EST:
>> Hi,
>>
>> I just sent a pull request for some faster NaN functions:
>> https://github.com/rigtorp/numpy.
>>
>> I implemented the following generalized ufuncs: nansum(), nancumsum(),
>> nanmean(), nanstd(), and, for fun, mean() and std(). It turns out that
>> the generalized-ufunc mean() and std() are faster than the current
>> numpy functions. I'm also going to add nanprod(), nancumprod(),
>> nanmax(), nanmin(), nanargmax(), and nanargmin().
>>
>> The current implementation is not optimized in any way, and there are
>> probably some speedups possible.
>>
>> I hope we can get this into numpy 2.0; people around me and I seem to
>> have a need for these functions.
>>
>> Erik
>> _______________________________________________
>> NumPy-Discussion mailing list
>> [email protected]
>> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
> How does this compare to Bottleneck?
>
> http://pypi.python.org/pypi/Bottleneck/
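[For context: the semantics these nan functions implement can be sketched in
plain NumPy. This is an illustrative sketch only, not Erik's C generalized-ufunc
implementation; the function name nanmean_sketch is invented here.]

```python
import numpy as np

def nanmean_sketch(a):
    """NaN-aware mean: average over the non-NaN entries only.

    Illustrative sketch of what a nanmean computes; the real speed
    wins in the pull request come from doing this in a single C loop.
    """
    a = np.asarray(a, dtype=float)
    valid = ~np.isnan(a)          # mask of entries that count
    n = valid.sum()
    if n == 0:
        return np.nan             # all-NaN input has no defined mean
    return a[valid].sum() / n

# nanmean_sketch([1.0, float("nan"), 3.0]) averages only 1.0 and 3.0,
# giving 2.0, whereas np.mean would return nan.
```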
I had all sorts of problems with ABI differences (this is the first time I've
tried numpy 2.0), so I couldn't get ipython, etc. to work with Erik's new nan
functions. That's why my speed comparison below might be hard to follow and
only tests one example.

For timing I used Bottleneck's autotimeit function:

>>> from bottleneck.benchmark.autotimeit import autotimeit

First, Erik's new nanmean:

>>> stmt = "nanmean2(a.flat)"
>>> setup = "import numpy as np; from numpy.core.umath_tests import nanmean as nanmean2; rs = np.random.RandomState([1,2,3]); a = rs.rand(100,100)"
>>> autotimeit(stmt, setup)
5.1356482505798338e-05

Bottleneck's low-level nanmean:

>>> stmt = "nanmean(a)"
>>> setup = "import numpy as np; from bottleneck.func import nanmean_2d_float64_axisNone as nanmean; rs = np.random.RandomState([1,2,3]); a = rs.rand(100,100)"
>>> autotimeit(stmt, setup)
1.5422070026397704e-05

Bottleneck's high-level nanmean:

>>> setup = "import numpy as np; from bottleneck.func import nanmean; rs = np.random.RandomState([1,2,3]); a = rs.rand(100,100)"
>>> autotimeit(stmt, setup)
1.7850480079650879e-05

NumPy's mean:

>>> setup = "import numpy as np; from numpy import mean; rs = np.random.RandomState([1,2,3]); a = rs.rand(100,100)"
>>> stmt = "mean(a)"
>>> autotimeit(stmt, setup)
1.6718170642852782e-05

SciPy's nanmean:

>>> setup = "import numpy as np; from scipy.stats import nanmean; rs = np.random.RandomState([1,2,3]); a = rs.rand(100,100)"
>>> stmt = "nanmean(a)"
>>> autotimeit(stmt, setup)
0.00024667191505432128

The tests above should be repeated for arrays that contain NaNs, and for
different array sizes and different axes. Bottleneck's benchmark suite can be
modified to do all that, but I can't import Erik's new numpy and Bottleneck at
the same time at the moment.
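[For readers without Bottleneck installed: a comparable per-call measurement
can be sketched with the standard-library timeit module. The helper name
best_per_call is invented here, and the absolute numbers will of course differ
by machine; the point is the methodology of best-of-N divided by call count,
similar in spirit to what autotimeit reports.]

```python
import timeit

# Same setup string as in the benchmarks above.
setup = ("import numpy as np;"
         " rs = np.random.RandomState([1, 2, 3]);"
         " a = rs.rand(100, 100)")

def best_per_call(stmt, setup, number=1000, repeat=3):
    """Return the best per-call time in seconds for `stmt`.

    timeit.repeat returns the total time for `number` executions,
    once per repetition; taking the minimum and dividing by `number`
    filters out timing noise from other processes.
    """
    totals = timeit.repeat(stmt, setup=setup, number=number, repeat=repeat)
    return min(totals) / number

t_mean = best_per_call("np.mean(a)", setup)
print("np.mean(a): %.3g s per call" % t_mean)
```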
