Hi all, in my code I use the function 'logsumexp' from scipy.maxentropy a lot. As far as I can tell, this function has no vectorized version that works on an m-x-n matrix. I might be doing something wrong here, but I found that it can run extremely slowly when called in a loop. My setup: I have an array of log probability vectors, such that exponentiating each column gives a vector that sums to one. I want to iterate over each column and renormalize it, using exp(col - logsumexp(col)). Here is the code that I used to profile this operation:
from numpy import *
from numpy.random.mtrand import dirichlet
from scipy.maxentropy import logsumexp
import time

# Build an array of log probability vectors; each column of
# exp(log_prob_vectors) is a probability vector that sums to one.
num_vectors = 1000000
log_prob_vectors = transpose(log(dirichlet([1, 1, 1], num_vectors)))

# Now renormalize each column, one logsumexp call at a time.
norm_prob_vectors = []
t1 = time.time()
for n in range(num_vectors):
    norm_p = exp(log_prob_vectors[:, n] - logsumexp(log_prob_vectors[:, n]))
    norm_prob_vectors.append(norm_p)
t2 = time.time()
norm_prob_vectors = array(norm_prob_vectors)
print "logsumexp renormalization (%d many times) took %s seconds." % (num_vectors, str(t2 - t1))

I found that even with only 100,000 vectors, this code takes about 5 seconds:

logsumexp renormalization (100000 many times) took 5.07085394859 seconds.

With 1 million vectors, it becomes prohibitively slow:

logsumexp renormalization (1000000 many times) took 70.7815010548 seconds.

Is there a way to speed this up? Most vectorized operations that work on matrices in numpy/scipy are incredibly fast, and it seems like a vectorized version of logsumexp should be near instant on this scale. Is there a way to rewrite the above snippet so that it's faster? Thanks very much for your help.
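For reference, here is a minimal sketch of the kind of thing I'm hoping for: a hand-rolled column-wise logsumexp using the standard subtract-the-max trick for numerical stability. The helper name logsumexp_cols is just my own, not anything in scipy, and I haven't benchmarked this carefully:

import numpy as np

def logsumexp_cols(a):
    # Column-wise logsumexp of an m-x-n array: returns a length-n
    # vector whose k-th entry is log(sum(exp(a[:, k]))). Subtracting
    # the per-column max before exponentiating avoids overflow.
    m = a.max(axis=0)
    return m + np.log(np.sum(np.exp(a - m), axis=0))

# Renormalize every column in one shot, with no Python-level loop.
# The result keeps the same 3-x-n layout as the input; transpose it
# if you want one vector per row, as in the loop version above.
norm_prob_vectors = np.exp(log_prob_vectors - logsumexp_cols(log_prob_vectors))

This replaces a million Python-level function calls with a fixed number of whole-array operations, which I assume is where the speedup would come from.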