Re: [Numpy-discussion] categorical distributions

josef . pktd Mon, 22 Nov 2010 05:46:23 -0800

On Mon, Nov 22, 2010 at 6:05 AM, Hagen Fürstenau <[email protected]> wrote:
>> ISTM that this elementary functionality deserves an implementation
>> that's as fast as it can be.
>
> To substantiate this, I just wrote a simple implementation of
> "categorical" in "numpy/random/mtrand.pyx" and it's more than 8x faster
> than your version for multiple samples of the same distribution and more
> than 3x faster than using "multinomial(1, ...)" for multiple samples of
> different distributions (each time tested with 1000 samples drawn from
> distributions over 1000 categories).
>
> I can provide it as a patch if there's any interest.


Can you compare the speed of your cython solution with the version of Chuck

--
For instance, weight 0..3 by 1..4, then

In [14]: w = arange(1,5)

In [15]: p = cumsum(w)/float(w.sum())

In [16]: bincount(p.searchsorted(random(1000000)))/1e6
Out[16]: array([ 0.100336,  0.200382,  0.299132,  0.40015 ])

-------------
from numpy mailing list thread "Weighted random integers", sep. 10
Using searchsorted hat roughly a 10 times speedup compared to my
multinomial version

Josef


>
> - Hagen
>
> _______________________________________________
> NumPy-Discussion mailing list
> [email protected]
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
_______________________________________________
NumPy-Discussion mailing list
[email protected]
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] categorical distributions

Reply via email to