David Cournapeau <david <at> ar.media.kyoto-u.ac.jp> writes:

> Still, it is indeed really slow for your case; when I fixed nanmean and
> co, I did not know much about numpy, I just wanted them to give the
> right answer :) I think this can be made faster, specially for your case
> (where the axis along which the median is computed is really small).
> 

I've found that if I just strip the nans out of each array and call the regular
numpy median on what is left, it is quicker: about 10 times slower than the list
median, rather than 35 times slower. Could you just wire nanmedian to do it this
way? The only difference is that on an empty list nanmedian gives nan, whereas
median throws an IndexError.
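
As a rough sketch of that idea for a single 1-D point (the name nanmedian_1d is
made up for illustration and is not an existing numpy or scipy function), the
empty case is checked explicitly so that the result is nan instead of an error:

```python
import numpy as np

def nanmedian_1d(values):
    """Median of a 1-D sequence with nans dropped; nan if nothing is left.

    Hypothetical helper, only meant to illustrate the proposal.
    """
    arr = np.asarray(values, dtype=float)
    kept = arr[~np.isnan(arr)]      # boolean mask keeps the non-nan entries
    if kept.size == 0:              # avoid calling median on an empty array
        return np.nan
    return np.median(kept)
```

This only covers the 1-D case; an n-dimensional nanmedian would still have to
apply it along the requested axis.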

Below is my profiling code with this change. Sample output:

$ ./arrayspeed3.py
list build time: 0.16
list median time: 0.08
array nanmedian time: 0.98

Peter

===

import numpy as np
from time import perf_counter  # time.clock was removed in Python 3.8

def my_median(vallist):
    """Median of a plain Python list; returns nan for an empty list."""
    num_vals = len(vallist)
    if num_vals == 0:
        return np.nan
    vallist.sort()  # sorts the caller's list in place
    if num_vals % 2 == 1:  # odd
        index = (num_vals - 1) // 2
        return vallist[index]
    else:  # even
        index = num_vals // 2
        return (vallist[index] + vallist[index - 1]) / 2

numtests = 100
testsize = 1000
pointlen = 3

# Build the same data twice: as nested lists with the missing values removed,
# and as an array with the missing values marked as nan.
t0 = perf_counter()
natests = np.random.rand(numtests, testsize, pointlen)
# have to start with inf because list.remove(nan) doesn't remove nan
natests[natests > 0.9] = np.inf
tests = natests.tolist()
natests[natests == np.inf] = np.nan
for test in tests:
    for point in test:
        while np.inf in point:
            point.remove(np.inf)
t1 = perf_counter()
print("list build time:", t1 - t0)


# Pure-Python median over the lists.
allmedians = []
t0 = perf_counter()
for test in tests:
    medians = [my_median(x) for x in test]
    allmedians.append(medians)
t1 = perf_counter()
print("list median time:", t1 - t0)

# Array version: drop the nans from each point, then use the regular numpy median.
t0 = perf_counter()
namedians = []
for natest in natests:
    thismed = []
    for point in natest:
        maskpoint = point[~np.isnan(point)]
        if len(maskpoint) > 0:
            med = np.median(maskpoint)
        else:
            med = np.nan
        thismed.append(med)
    namedians.append(thismed)
t1 = perf_counter()
print("array nanmedian time:", t1 - t0)
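
For the case mentioned in the quoted message, where the axis being reduced is
very short, the per-point Python loop can probably be avoided altogether. The
following is only a sketch under that assumption, not how nanmedian is actually
implemented: nanmedian_last_axis is a hypothetical name, and it relies on
np.sort placing nans at the end of each row and on np.take_along_axis
(NumPy 1.15 or later).

```python
import numpy as np

def nanmedian_last_axis(a):
    """Median over the last axis, ignoring nans (hypothetical helper).

    Rows that contain only nans give nan.
    """
    a = np.asarray(a, dtype=float)
    s = np.sort(a, axis=-1)              # nans are sorted to the end of each row
    n = np.sum(~np.isnan(a), axis=-1)    # number of valid values per row
    lo = np.maximum(n - 1, 0) // 2       # index of the lower middle value
    hi = n // 2                          # index of the upper middle value
    lo_val = np.take_along_axis(s, lo[..., None], axis=-1)[..., 0]
    hi_val = np.take_along_axis(s, hi[..., None], axis=-1)[..., 0]
    # For an odd count lo == hi, so the average is just the middle value.
    return 0.5 * (lo_val + hi_val)
```

With the arrays from the script above, nanmedian_last_axis(natests) would return
an array of shape (numtests, testsize) in a single call. Whether it is actually
faster than the loop would need to be timed.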




_______________________________________________
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion
