Re: [Rd] var/sd and NAs in R2.7.0

Simon Urbanek Fri, 16 May 2008 08:43:01 -0700

Robert,

this was discussed before:
https://stat.ethz.ch/pipermail/r-devel/2007-December/047594.html


and it *is* mentioned in NEWS:

    o   co[rv](use = "complete.obs") now always gives an error if there
        are no complete cases: they used to give NA if
        method = "pearson" but an error for the other two methods.
        (Note that this is pretty arbitrary, but zero-length vectors
        always give an error so it is at least consistent.)

        cor(use="pair") used to give diagonal 1 even if the variable
        was completely missing for the rank methods but NA for the
        Pearson method: it now gives NA in all cases.

        cor(use="pair") for the rank methods gave a matrix result with
        dimensions > 0 even if one of the inputs had 0 columns.

[sd(..,na.rm=TRUE) -> cov(..,use="complete.obs")]

Cheers,
Simon


On May 16, 2008, at 11:19 AM, McGehee, Robert wrote:

I know I can get around this, I just would prefer that if R is breaking
backwards compatibility, then it's intentional (maybe it is, I just
don't know). That is, I don't want to require my entire company to
upgrade to 2.7.0 just so I can deploy a fix here, and I'd prefer not to
check the argument list of var every time I use it.

# Braces keep the if/else parseable when entered at top level
if ("use" %in% names(formals(var))) {
        var(x, na.rm=TRUE, use="p")
} else {
        var(x, na.rm=TRUE)
}
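
One way to avoid inspecting formals(var) at every call site is a small wrapper that drops the NAs itself and returns NA when fewer than two values remain. This is only a sketch: safe_sd is a made-up name, not part of base R, and returning NA_real_ for the all-NA case is an assumption chosen to match the 2.6.2 behavior described in this thread.

```r
# safe_sd is a hypothetical helper, not a base R function.
# It removes NAs before calling sd(), so it never relies on how
# var()/cov() treat "no complete observations" in a given R version.
safe_sd <- function(x, na.rm = TRUE) {
  x <- as.numeric(x)
  if (na.rm) {
    x <- x[!is.na(x)]
  }
  # Assumption: with fewer than two usable values, report NA
  # instead of raising an error.
  if (length(x) < 2L) {
    return(NA_real_)
  }
  sd(x)
}

safe_sd(c(NA, NA, NA))   # NA rather than an error
safe_sd(c(1, 2, 3, NA))  # standard deviation of the non-missing values
```

Because the wrapper does its own NA handling, the same code should behave identically on machines that have not yet been upgraded.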


-----Original Message-----
From: Gabor Grothendieck [mailto:[EMAIL PROTECTED]
Sent: Friday, May 16, 2008 11:03 AM
To: McGehee, Robert
Cc: R-devel
Subject: Re: [Rd] var/sd and NAs in R2.7.0

Try

var(c(NA, NA, NA), use = "pairwise.complete.obs")


On Fri, May 16, 2008 at 10:56 AM, McGehee, Robert
<[EMAIL PROTECTED]> wrote:

Hello all,
I just upgraded to R 2.7.0 and found that the behavior of 'var' and
'sd' has changed in the presence of NAs (this wasn't explicit in the
NEWS file, though I see it probably has to do with the change for
cor/cov). Anyway, I just want to make sure that it was intentional to
produce an error when all values are NA and na.rm=TRUE, rather than
returning an NA (like R 2.6.2), or NaN (like the function 'mean' does).
That is, isn't the purpose of 'na.rm=TRUE' to, in part, suppress these
error messages?

Specifically,

var(c(NA, NA, NA), na.rm=TRUE) # R2.6.2

[1] NA

var(c(NA, NA, NA), na.rm=TRUE) # R2.7.0

Error during wrapup: no complete observations in cov/cor

I think I can get the old behavior by setting use='p', but the 'sd'
function does not have a 'use' argument and I'd like not to get an
error here. Anyway, I'm a fan of the old behavior (not producing an
error), but if there was a reason to change this when na.rm=TRUE, I
would request that the 'sd' function be updated to be able to revert to
the old behavior as well.

FYI: I 'apply' these functions to large matrices of stock return time
series with missing values, and don't want the whole calculation to
fail just because I'm missing stock returns for one company.
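
For the 'apply' use case, one possible stopgap is to trap the error per column so that a single all-NA series yields NA instead of aborting the whole call. The sketch below uses only base R (tryCatch, apply, sd); the matrix, its ticker names, and the helper name sd_or_na are invented for illustration and are not from the original report.

```r
# Hypothetical data: three days of returns for three tickers.
# "CCC" is entirely missing, which is the case that raises the error
# reported above in R 2.7.0.
returns <- cbind(AAA = c(0.01, -0.02, 0.03),
                 BBB = c(0.00, NA, 0.02),
                 CCC = c(NA_real_, NA_real_, NA_real_))

# sd_or_na is an illustrative helper, not part of R.
# Any error raised by sd() is converted to NA for that column only.
sd_or_na <- function(x) {
  tryCatch(sd(x, na.rm = TRUE), error = function(e) NA_real_)
}

col_sd <- apply(returns, 2, sd_or_na)
col_sd
```

Note that tryCatch with a bare error handler hides every error from sd(), not only the "no complete observations" one, so a narrower check may be preferable in production code.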

Thanks,
Robert

Robert McGehee, CFA
Geode Capital Management, LLC
One Post Office Square, 28th Floor | Boston, MA | 02109
Tel: 617/392-8396    Fax:617/476-6389
mailto:[EMAIL PROTECTED]





______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel

