On Thu, Apr 23, 2009 at 2:56 PM, <[email protected]> wrote:

> On Thu, Apr 23, 2009 at 9:27 AM, Flavio Coelho <[email protected]> wrote:
> >
> > Hi,
> >
> > I stumbled upon something I think is a bug in scipy:
> >
> > In [4]: stats.randint(1.,15.).ppf([.1, .2,.3,.4,.5])
> > Out[4]: array([ 2., 3., 5., 6., 7.])
> >
> > When you pass float arguments to stats.randint and then call the ppf method,
> > you get an array of floats, which is clearly wrong. The rvs method doesn't
> > display this bug, so I think it is a matter of poor type checking in the ppf
> > implementation...
> >
>
> I switched to using floats intentionally, to have correct handling of
> inf and nans. And the argument checking is generic for all discrete
> distributions and not special cased. Nans are converted to zero when
> casting to integers, which is wrong and very confusing. inf raises an
> exception. I prefer correct numbers to correct types. See examples
> below

I understand. I couldn't find, however, any parameterization that makes randint return inf or nan, so it should be safe to make randint return integers.

I tested the equivalent function for Poisson in R (qpois), and it also returns Inf when the quantile is 1, so no problem there.

I did find some weird behavior with stats.poisson.ppf:

In [34]: stats.poisson.ppf([0,1])
Out[34]: array([ -1., Inf])  # R returns [0, Inf]

In [35]: stats.poisson.ppf([0.,.1])
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)

/home/fccoelho/<ipython console> in <module>()

/usr/lib/python2.6/dist-packages/scipy/stats/distributions.pyc in ppf(self, q, *args, **kwds)
   4002         goodargs = argsreduce(cond, *((q,)+args+(loc,)))
   4003         loc, goodargs = goodargs[-1], goodargs[:-1]
-> 4004         place(output,cond,self._ppf(*goodargs) + loc)
   4005
   4006         if output.ndim == 0:

TypeError: _ppf() takes exactly 3 arguments (2 given)

I hope these bug reports help. I think handling infinity and zero is almost always problematic; maybe the best we can do is aim for predictable behavior (possibly matching the behavior of the equivalent R functions).

If randint returned ints regardless of the type of its parameters, the behavior of my code would be a lot more predictable.

> If you don't have inf and nans you can cast them to int yourself.
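
For reference, a minimal sketch of what such a cast could look like on the caller's side. The helper name ppf_as_int is hypothetical (it is not part of scipy or numpy), and the policy it applies, cast only when every value is finite, is just one assumption about what "safe" means here:

```python
import numpy as np


def ppf_as_int(values):
    """Cast quantile values to int only when all of them are finite.

    Hypothetical helper, not a scipy API. If any value is inf or nan,
    the float array is returned unchanged so those values are not
    silently turned into garbage integers.
    """
    arr = np.asarray(values, dtype=float)
    if np.all(np.isfinite(arr)):
        return arr.astype(int)
    return arr


if __name__ == "__main__":
    # Finite quantiles: safe to cast to an integer array.
    print(ppf_as_int([2.0, 3.0, 5.0, 6.0, 7.0]))
    # A non-finite quantile: the floats are left as they are.
    print(ppf_as_int([0.0, np.inf]))
```

It could be wrapped around any ppf call, e.g. ppf_as_int(stats.randint(1, 15).ppf(q)), so the integer type is only used when no information would be lost.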
>
> Josef
>
> >>> aint = np.zeros(5,dtype=int)
> >>> aint[0]= np.nan
> >>> aint
> array([0, 0, 0, 0, 0])
> >>> aint[1]= np.inf
> Traceback (most recent call last):
>   File "<pyshell#134>", line 1, in <module>
> OverflowError: cannot convert float infinity to long
>
> >>> from scipy import stats
> >>> stats.poisson.ppf(1,1)
> inf
> >>> stats.poisson.ppf(2,1)
> nan
>
> >>> stats.poisson.ppf(1,1).astype(int)
> -2147483648
> >>> aint[2] = stats.poisson.ppf(1,1)
> Traceback (most recent call last):
>   File "<pyshell#140>", line 1, in <module>
> OverflowError: cannot convert float infinity to long
> _______________________________________________
> Numpy-discussion mailing list
> [email protected]
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>

--
---
Flávio Codeço Coelho
