Paul Moore <pf_moore <at> yahoo.co.uk> writes: > Robert Kern <robert.kern <at> gmail.com> writes: >> On Thu, Sep 18, 2008 at 16:55, Paul Moore <pf_moore <at> >> yahoo.co.uk> wrote: >>> I want to generate a series of random samples, to do simulations >>> based >>> on them. Essentially, I want to be able to produce a SAMPLESIZE * N >>> matrix, where each row of N values consists of either >>> >>> 1. Integers between 1 and M (simulating M rolls of an N-sided >>> die), or >>> 2. A sample of N numbers between 1 and M without repeats (simulating >>> deals of N cards from an M-card deck). >>> >>> Example (1) is easy, numpy.random.random_integers(1, M, >>> (SAMPLESIZE, N)) >>> >>> But I can't find an obvious equivalent for (2). Am I missing >>> something >>> glaringly obvious? I'm using numpy - is there maybe something in >>> scipy I >>> should be looking at? >> >> numpy.array([(numpy.random.permutation(M) + 1)[:N] >> for i in range(SAMPLESIZE)]) >> > > Thanks. > > And yet, this takes over 70s and peaks at around 400M memory use, > whereas the > equivalent for (1) > > numpy.random.random_integers(1,M,(SAMPLESIZE,N)) > > takes less than half a second, and negligible working memory (both > end up > allocating an array of the same size, but your suggestion consumes > temporary > working memory - I suspect, but can't prove, that the time taken > comes from > memory allocations rather than computation.
It seems like numpy.random.permutation is pretty suboptimal in its speed. Here's a Python 1-liner that does the same thing (I think) but is a lot faster: a = 1+numpy.random.rand(M).argsort()[0:N-1] This still has the the problem that it generates a size N array to start with. But at least it is fast compared with permutation: >>> t1 = timeit.Timer("a = 1+numpy.random.permutation(M)[0:N-1]", "import numpy; M=1000; N=100") >>> t1.repeat(3,1000) [1.3792998790740967, 1.3387739658355713, 1.3446521759033203] >>> t2 = timeit.Timer("a = 1+numpy.random.rand(M).argsort()[0:N-1]", "import numpy; M=1000; N=100") >>> t2.repeat(3,1000) [0.17597699165344238, 0.16013598442077637, 0.16066503524780273] _______________________________________________ Numpy-discussion mailing list Numpy-discussion@scipy.org http://projects.scipy.org/mailman/listinfo/numpy-discussion