On 3/22/2006 10:08 AM, Peter Dalgaard wrote:
> Duncan Murdoch <[EMAIL PROTECTED]> writes:
>
>> On 3/22/2006 3:52 AM, [EMAIL PROTECTED] wrote:
>>
>>> >>>>> "cspark" == cspark <[EMAIL PROTECTED]>
>>> >>>>>     on Wed, 22 Mar 2006 05:52:13 +0100 (CET) writes:
>>>
>>>   cspark> Full_Name: Chanseok Park
>>>   cspark> Version: R 2.2.1
>>>   cspark> OS: RedHat EL4
>>>   cspark> Submission from: (NULL) (130.127.112.89)
>>>
>>>   cspark> pbinom(any negative value, size, prob) should be zero.
>>>   cspark> But I got the following results. I mean, if a negative
>>>   cspark> value is close to zero, then pbinom() calculates
>>>   cspark> pbinom(0, size, prob).
>>>
>>>   > pbinom(-2.220446e-22, 3, .1)
>>>   [1] 0.729
>>>   > pbinom(-2.220446e-8, 3, .1)
>>>   [1] 0.729
>>>   > pbinom(-2.220446e-7, 3, .1)
>>>   [1] 0
>>>
>>> Yes, all the [dp]* functions which are discrete with mass on the
>>> integers only do *round* their 'x' to integers.
>>>
>>> I could well argue that the current behavior is *not* a bug, since
>>> we do treat "x close to integer" as integer, and hence
>>> pbinom(eps, size, prob) with eps "very close to 0" should give
>>> pbinom(0, size, prob), as it now does.
>>>
>>> However, for aesthetic reasons, I agree that we should test for
>>> "< 0" first (and give 0 then), and only round otherwise. I'll
>>> change this for R-devel (i.e. R 2.3.0, in about a month).
>>>
>>>   cspark> dbinom() also behaves similarly.
>>>
>>> Yes, similarly, but differently. I have changed it (for R-devel) as
>>> well, to behave the same as the other d*() functions, e.g. dpois()
>>> and dnbinom(), do.
>>
>> Martin, your description makes it sound as though dbinom(0.3, size,
>> prob) would give the same answer as dbinom(0, size, prob), whereas
>> it actually gives 0 with a warning, as documented in ?dbinom. The
>> d* functions only round near-integers to integers, where "near"
>> appears to mean within 1e-7. The p* functions round near-integers
>> to integers, and truncate all other values to the integer below.
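To make the two rounding rules concrete, here is a minimal sketch.
The outputs noted in the comments are those reported in this thread
for R 2.2.1; the handling of negative 'x' is exactly what the change
slated for R 2.3.0 alters, so later versions may differ.

  ## p*() rule as described above: 'x' within 1e-7 of an integer is
  ## rounded to that integer; anything else is truncated to the
  ## integer below.
  pbinom(-2.220446e-22, 3, 0.1)  # 0.729 -- treated as pbinom(0, 3, 0.1)
  pbinom(-2.220446e-8,  3, 0.1)  # 0.729 -- still within 1e-7 of 0
  pbinom(-2.220446e-7,  3, 0.1)  # 0     -- truncated to -1, hence 0
  pbinom(0, 3, 0.1)              # 0.729 == 0.9^3, i.e. P(X = 0)

  ## d*() rule: only near-integers are rounded; any other non-integer
  ## 'x' gives 0 with a warning, as documented in ?dbinom.
  dbinom(0.3, 3, 0.1)            # 0, with a warning about non-integer x
  dbinom(1 + 1e-8, 3, 0.1)       # treated as dbinom(1, 3, 0.1)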
> Well, the p-functions are constant on the intervals between
> integers...

Not quite: they're constant on the intervals (n - 1e-7, n + 1 - 1e-7)
for integers n. Since Martin's change, this is no longer true for n = 0.

> (Or did you refer to the lack of a warning? One point could be that
> cumulative distribution functions extend naturally to non-integers,
> whereas densities don't really extend, since they are defined with
> respect to counting measure on the integers.)

I wasn't complaining about the behaviour here; I was just clarifying
Martin's description of it, when he said that "all the [dp]* functions
which are discrete with mass on the integers only, do *round* their
'x' to integers".

>> I suppose the reason for this behaviour is to protect against
>> rounding error giving nonsense results; I'm not sure that's a great
>> idea, but if we do it, should we really be handling 0 differently?
>
> Most of these round-near-integer issues were spurred by real
> programming problems. It is somewhat hard to come up with a problem
> that leads you to generate a binomial variate with "floating point
> noise", but I'm quite sure that we'll be reminded if we try to
> change it... (One potential issue is back-calculation to counts from
> relative frequencies.)

Again, I wasn't suggesting we change the general +/- 1e-7 behaviour
(though it should be documented, to avoid bug reports like this one),
but I'm worried about having zero as a special case. This will break
relations such as

  dbinom(x, n, 0.5) == dbinom(n - x, n, 0.5)

in the case where x is n + epsilon or -epsilon, for small enough
epsilon. Is it really desirable to break the symmetry like this? (A
sketch of the failing relation follows below.)

Duncan Murdoch
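For concreteness, a small sketch of the symmetry check described
above. Which values you see depends on whether a negative 'x' near
zero is rounded to 0 (the R 2.2.1 behaviour) or caught by the "< 0"
test first (the change discussed for R 2.3.0):

  ## For p = 0.5 the binomial density is symmetric:
  ## dbinom(x, n, 0.5) == dbinom(n - x, n, 0.5).  Near-integer
  ## rounding extends this to x within 1e-7 of an integer -- unless
  ## negative x is special-cased to 0.
  n   <- 3
  eps <- 1e-8                    # well inside the 1e-7 window

  dbinom(n + eps, n, 0.5)        # rounds to dbinom(3, 3, 0.5) = 0.125
  dbinom(-eps, n, 0.5)           # 0.125 if rounded to 0; 0 if "< 0" wins

  ## The relation in question (note n - (-eps) == n + eps):
  dbinom(-eps, n, 0.5) == dbinom(n - (-eps), n, 0.5)
  ## TRUE under pure rounding; FALSE once negative x returns 0.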