G'day Martin (and "listeners"), On Fri, 7 Mar 2008 15:01:26 +0100 Martin Maechler <[EMAIL PROTECTED]> wrote:
[...] > >> If you feel like finding another elegant patch... > > BAT> Well, elegance is in the eye of the beholder. :-) > > BAT> I attach two patches. One that adds warning messages at > BAT> the other places where NAs can be generated. > > ok. The result is very simple ``hence elegant''. > > But actually, part of the changed behavior may be considered > undesirable: > > rnorm(2, mean = NA) > > which gives two NaN's would now produce a warning, > where I could argue that > 'arithmetic with NAs should give NAs without a warning' > since > 1:2 + NA > also gives NAs without a warning. > > So we could argue that a warning should *only* be produced in a > case where the parameters of the distribution are not NA. > > What do others (particularly R-core !) think? I can agree with that point of view. But then, should a warning not be issued only if one of the parameters of the distribution is NA, or should it also not be issued if any non-finite parameter is encountered? After all, > 1:2 + Inf [1] Inf Inf does not create any warning either. In that case, a patch as the attached should do, it checks all parameters for finiteness and then checks whether the generated number is not finite. Thus, with the patch applied I see the following behaviour: > rnorm(2, mean=NA) [1] NaN NaN > rnorm(5, mean=c(0,Inf, -Inf, NA, NaN)) [1] 1.897874 NaN NaN NaN NaN > rbinom(2, size=20, p=1.2) [1] NaN NaN Warning message: In rbinom(2, size = 20, p = 1.2) : NAs produced > rexp(2, rate=-2) [1] NaN NaN Warning message: In rexp(2, rate = -2) : NAs produced Without the patch: > rnorm(2, mean=NA) [1] NaN NaN > rnorm(5, mean=c(0,Inf, -Inf, NA, NaN)) [1] -0.1719657 NaN NaN NaN NaN > rbinom(2, size=20, p=1.2) [1] NaN NaN > rexp(2, rate=-2) [1] NaN NaN Warning message: In rexp(2, rate = -2) : NAs produced On my machine, "make check FORCE=FORCE" passes with this patch. [...] > For now, I will ignore this second patch. > > - it does bloat the code slightly (as you conceded) Fair enough. :) I also proved my point that more complicated code is harder to maintain. In the two parameter case I was testing twice na for being one instead of testing na and nb....... [...] > but most importantly: > > - we have no idea if the speedup (when <Simple> is TRUE) > is in the order of 10%, 1% or 0.1% > > My guess would be '0.1%' for rnorm(), and that would > definitely not warrant the extra check. I would not bet against this. Probably even with all the additional checks for finiteness of parameters there would be not much speed difference. The question might be whether you want to replace the (new) R_FINITE()'s rather by ISNA()'s (or something else). One could also discuss in which order the checks should be made (first generated number then parameters of distribution or vice versa). But I will leave this to R-core to decide. :) > >> Thank you Berwin, for your contribution! > > and thanks again! Still my pleasure. :) Cheers, Berwin
Index: src/main/random.c =================================================================== --- src/main/random.c (revision 44693) +++ src/main/random.c (working copy) @@ -44,7 +44,7 @@ for (i = 0; i < n; i++) { ai = a[i % na]; x[i] = f(ai); - if (!R_FINITE(x[i])) naflag = 1; + if (!R_FINITE(x[i]) && R_FINITE(ai)) naflag = 1; } return(naflag); } @@ -81,6 +81,7 @@ if (na < 1) { for (i = 0; i < n; i++) REAL(x)[i] = NA_REAL; + warning(_("NAs produced")); } else { PROTECT(a = coerceVector(CADR(args), REALSXP)); @@ -116,14 +117,14 @@ ai = a[i % na]; bi = b[i % nb]; x[i] = f(ai, bi); - if (!R_FINITE(x[i])) naflag = 1; + if (!R_FINITE(x[i]) && R_FINITE(ai) && R_FINITE(bi)) naflag = 1; } return(naflag); } #define RAND2(num,name) \ case num: \ - random2(name, REAL(a), na, REAL(b), nb, REAL(x), n); \ + naflag = random2(name, REAL(a), na, REAL(b), nb, REAL(x), n); \ break /* "do_random2" - random sampling from 2 parameter families. */ @@ -155,6 +156,7 @@ if (na < 1 || nb < 1) { for (i = 0; i < n; i++) REAL(x)[i] = NA_REAL; + warning(_("NAs produced")); } else { PROTECT(a = coerceVector(CADR(args), REALSXP)); @@ -200,14 +202,14 @@ bi = b[i % nb]; ci = c[i % nc]; x[i] = f(ai, bi, ci); - if (!R_FINITE(x[i])) naflag = TRUE; + if (!R_FINITE(x[i]) && R_FINITE(ai) && R_FINITE(bi) && R_FINITE(ci)) naflag = TRUE; } return(naflag); } #define RAND3(num,name) \ case num: \ - random3(name, REAL(a), na, REAL(b), nb, REAL(c), nc, REAL(x), n); \ + naflag = random3(name, REAL(a), na, REAL(b), nb, REAL(c), nc, REAL(x), n); \ break @@ -244,6 +246,7 @@ if (na < 1 || nb < 1 || nc < 1) { for (i = 0; i < n; i++) REAL(x)[i] = NA_REAL; + warning(_("NAs produced")); } else { PROTECT(a = coerceVector(a, REALSXP));
______________________________________________ R-devel@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-devel