On Sat, Oct 17, 2009 at 8:36 AM, per freem <perfr...@gmail.com> wrote: > hi all, > > in my code, i use the function 'logsumexp' from scipy.maxentropy a > lot. as far as i can tell, this function has no vectorized version > that works on an m-x-n matrix. i might be doing something wrong here, > but i found that this function can run extremely slowly if used as > follows: i have an array of log probability vectors, such that each > column sums to one. i want to simply iterate over each column and > renormalize it, using exp(col - logsumexp(col)). here is the code that > i used to profile this operation: > > from scipy import * > from numpy import * > from numpy.random.mtrand import dirichlet > from scipy.maxentropy import logsumexp > import time > > # build an array of probability vectors. each column represents a > probability vector. > num_vectors = 1000000 > log_prob_vectors = transpose(log(dirichlet([1, 1, 1], num_vectors))) > # now renormalize each column, using logsumexp > norm_prob_vectors = [] > t1 = time.time() > for n in range(num_vectors): > norm_p = exp(log_prob_vectors[:, n] - logsumexp(log_prob_vectors[:, n])) > norm_prob_vectors.append(norm_p) > t2 = time.time() > norm_prob_vectors = array(norm_prob_vectors) > print "logsumexp renormalization (%d many times) took %s seconds." > %(num_vectors, str(t2-t1)) > > i found that even with only 100,000 elements, this code takes about 5 seconds: > > logsumexp renormalization (100000 many times) took 5.07085394859 seconds. > > with 1 million elements, it becomes prohibitively slow: > > logsumexp renormalization (1000000 many times) took 70.7815010548 seconds. > > is there a way to speed this up? most vectorized operations that work > on matrices in numpy/scipy are incredibly fast and it seems like a > vectorized version of logsumexp should be near instant on this scale. > is there a way to rewrite the above snippet so that it's faster? > > thanks very much for your help.
Here's logsumexp from scipy: def logsumexp(a): a = asarray(a) a_max = a.max() return a_max + log((exp(a-a_max)).sum()) Would this work: def logsumexp2(a): a = asarray(a) a_max = a.max(axis=0) return a_max + log((exp(a-a_max)).sum(axis=0)) ? _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion