Re: [Numpy-discussion] categorical distributions

2010-11-23 Thread Hagen Fürstenau
> Can you compare the speed of your cython solution with the version of Chuck

For multiple samples of the same distribution, it would do more or less the same as the "searchsorted" method, so I don't expect any improvement (except for being easier to find). For multiple samples of different distr
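For readers following along, the "searchsorted" approach mentioned above can be sketched roughly as follows. This is a minimal illustration, not the code posted in the thread: the helper name `sample_categorical` and its parameters are hypothetical, and it assumes a NumPy recent enough to provide `np.random.default_rng`, which postdates this discussion.

```python
import numpy as np

def sample_categorical(p, size, rng=None):
    """Draw `size` category indices with probabilities `p`.

    Hypothetical helper for illustration; not part of NumPy.
    """
    rng = np.random.default_rng() if rng is None else rng
    p = np.asarray(p, dtype=float)
    # Cumulative distribution, renormalised so the last entry is exactly 1.0
    # even if `p` does not sum to 1 because of rounding.
    cdf = np.cumsum(p)
    cdf /= cdf[-1]
    # Uniform draws in [0, 1); each one falls into exactly one CDF interval.
    u = rng.random(size)
    # side="right" returns i with cdf[i-1] <= u < cdf[i], so categories
    # with zero probability are never selected.
    return np.searchsorted(cdf, u, side="right")

samples = sample_categorical([.5, .3, .2], size=10_000)
```

The cumulative sum is computed once and each draw then costs a binary search, which is why this scales better than building one multinomial count vector per sample.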

Re: [Numpy-discussion] categorical distributions

2010-11-22 Thread josef . pktd
On Mon, Nov 22, 2010 at 6:05 AM, Hagen Fürstenau wrote:
>> ISTM that this elementary functionality deserves an implementation
>> that's as fast as it can be.
>
> To substantiate this, I just wrote a simple implementation of
> "categorical" in "numpy/random/mtrand.pyx" and it's more than 8x faster

Re: [Numpy-discussion] categorical distributions

2010-11-22 Thread Hagen Fürstenau
> ISTM that this elementary functionality deserves an implementation
> that's as fast as it can be.

To substantiate this, I just wrote a simple implementation of "categorical" in "numpy/random/mtrand.pyx" and it's more than 8x faster than your version for multiple samples of the same distribution

Re: [Numpy-discussion] categorical distributions

2010-11-22 Thread Hagen Fürstenau
>> but this is bound to be inefficient as soon as the vector of
>> probabilities gets large, especially if you want to draw multiple samples.
>>
>> Have I overlooked something or should this be added?
>
> I think you misunderstand the point of multinomial distributions.

I'm afraid the multiple sa

Re: [Numpy-discussion] categorical distributions

2010-11-22 Thread David Warde-Farley
On 2010-11-22, at 2:51 AM, Hagen Fürstenau wrote:
> but this is bound to be inefficient as soon as the vector of
> probabilities gets large, especially if you want to draw multiple samples.
>
> Have I overlooked something or should this be added?

I think you misunderstand the point of multinomia

[Numpy-discussion] categorical distributions

2010-11-21 Thread Hagen Fürstenau
Hi,

numpy doesn't seem to have a function for sampling from simple categorical distributions. The easiest solution I could come up with was something like

>>> from numpy.random import multinomial
>>> multinomial(1, [.5, .3, .2]).nonzero()[0][0]
1

but this is bound to be inefficient as soon as th
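To make the concern about multiple samples concrete, here is a rough sketch of how the one-liner above might be extended to several draws. The helper name `draw_via_multinomial` is hypothetical, and the use of `np.random.default_rng` is an assumption about the reader's NumPy version rather than something available when this was posted.

```python
import numpy as np

def draw_via_multinomial(p, n_draws, rng=None):
    """Draw category indices by taking one multinomial trial per sample.

    Hypothetical helper for illustration; not part of NumPy.
    """
    rng = np.random.default_rng() if rng is None else rng
    # One row per draw; each row is a one-hot count vector of length len(p).
    one_hot = rng.multinomial(1, p, size=n_draws)
    # The position of the single 1 in each row is the sampled category.
    return one_hot.argmax(axis=1)

draws = draw_via_multinomial([.5, .3, .2], n_draws=1000)
```

Note that the intermediate array has shape `(n_draws, len(p))`, so memory and time grow with the length of the probability vector for every sample drawn.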