On Mon, Nov 22, 2010 at 6:05 AM, Hagen Fürstenau <[email protected]> wrote: >> ISTM that this elementary functionality deserves an implementation >> that's as fast as it can be. > > To substantiate this, I just wrote a simple implementation of > "categorical" in "numpy/random/mtrand.pyx" and it's more than 8x faster > than your version for multiple samples of the same distribution and more > than 3x faster than using "multinomial(1, ...)" for multiple samples of > different distributions (each time tested with 1000 samples drawn from > distributions over 1000 categories). > > I can provide it as a patch if there's any interest.
Can you compare the speed of your cython solution with the version of Chuck -- For instance, weight 0..3 by 1..4, then In [14]: w = arange(1,5) In [15]: p = cumsum(w)/float(w.sum()) In [16]: bincount(p.searchsorted(random(1000000)))/1e6 Out[16]: array([ 0.100336, 0.200382, 0.299132, 0.40015 ]) ------------- from numpy mailing list thread "Weighted random integers", sep. 10 Using searchsorted hat roughly a 10 times speedup compared to my multinomial version Josef > > - Hagen > > _______________________________________________ > NumPy-Discussion mailing list > [email protected] > http://mail.scipy.org/mailman/listinfo/numpy-discussion > _______________________________________________ NumPy-Discussion mailing list [email protected] http://mail.scipy.org/mailman/listinfo/numpy-discussion
