On Mon, Nov 22, 2010 at 1:26 PM, Keith Goodman <[email protected]> wrote: > On Mon, Nov 22, 2010 at 9:03 AM, Keith Goodman <[email protected]> wrote: > >> @cython.boundscheck(False) >> @cython.wraparound(False) >> def nanstd_twopass(np.ndarray[np.float64_t, ndim=1] a, int ddof): >> "nanstd of 1d numpy array with dtype=np.float64 along axis=0." >> cdef Py_ssize_t i >> cdef int a0 = a.shape[0], count = 0 >> cdef np.float64_t asum = 0, a2sum=0, amean, ai, da >> for i in range(a0): >> ai = a[i] >> if ai == ai: >> asum += ai >> count += 1 >> if count > 0: >> amean = asum / count >> asum = 0 >> for i in range(a0): >> ai = a[i] >> if ai == ai: >> da = ai - amean >> asum += da >> a2sum += (da * da) >> asum = asum * asum >> return sqrt((a2sum - asum / count) / (count - ddof)) >> else: >> return np.float64(NAN) > > This is 5% faster: > > @cython.boundscheck(False) > @cython.wraparound(False) > def nanstd_1d_float64_axis0_2(np.ndarray[np.float64_t, ndim=1] a, int ddof): > "nanstd of 1d numpy array with dtype=np.float64 along axis=0." > cdef Py_ssize_t i > cdef int a0 = a.shape[0], count = 0 > cdef np.float64_t asum = 0, amean, ai > for i in range(a0): > ai = a[i] > if ai == ai: > asum += ai > count += 1 > if count > 0: > amean = asum / count > asum = 0 > for i in range(a0): > ai = a[i] > if ai == ai: > ai -= amean > asum += (ai * ai) > return sqrt(asum / (count - ddof)) > else: > return np.float64(NAN)
I think it would be better to write nanvar instead of nanstd and take the square root only in a delegating nanstd, instead of the other way around. (Also a change that should be made in scipy.stats) Josef > _______________________________________________ > NumPy-Discussion mailing list > [email protected] > http://mail.scipy.org/mailman/listinfo/numpy-discussion > _______________________________________________ NumPy-Discussion mailing list [email protected] http://mail.scipy.org/mailman/listinfo/numpy-discussion
