On Thu, 22 Jun 2006, [EMAIL PROTECTED] wrote:
>
> In the code below, fn1() and fn2() fail with the messages given in the 
> comments.
> Strangely, fn2() fails for all data sets I've tried except for those with 100
> rows.
<snip>
> fn1 <- function(model, data)
> {
>       w <- runif(nrow(data));
>       print(lm(model, data=data, weights=w));
> }
>
> fn2 <- function(model, data)
> {
>       print(lm(model, data=data, weights=runif(nrow(data))));
> }


This is the result of an interaction between a (IMO bad) design choice 
when lm and glm were first introduced in S-PLUS and a (IMO good) design 
choice more recently in R.

The bad design choice was that
   lm(model, data=data, weights=w)
is interpreted more like
   lm(model, data=data, weights=~w)

That is, from the outside weights=w looks like an ordinary argument passed 
by value, but it is actually treated as a name to be looked up in the 
data= argument.

This still wouldn't be too bad, except that if there is no element of 
data= called "w", lm() looks further. In S-PLUS it looks in the calling 
frame and then in the global workspace. In R it looks at the environment 
where the formula was defined.

Neither of these is necessarily what you expect, but people expect a wide 
range of incompatible things, so this isn't decisive.
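
The lookup order in R can be checked with a small experiment. This is only 
a sketch: the data frame A and the names f and res are made up for 
illustration, and it assumes a clean workspace with no global object 
called w.

```r
set.seed(1)
A <- data.frame(x = rnorm(20), y = rnorm(20))  # hypothetical example data

fn1 <- function(model, data) {
  w <- runif(nrow(data))               # w exists only inside fn1()
  lm(model, data = data, weights = w)
}

# The formula is created here, in the caller, so environment(f) is the
# caller's environment and not the inside of fn1().
f <- y ~ x

# lm() looks for w in `data` first, then in environment(f).
# The local w in fn1() is never consulted, so the call fails.
res <- try(fn1(f, A), silent = TRUE)
print(inherits(res, "try-error"))
```

If a vector called w of the right length happened to exist in the 
workspace, the same call would run and silently use that vector instead.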

There are at least two ways to get the result you want.  The simpler 
and cruder way is to make w a column of the data frame. This is 
inefficient in memory if data is very large, and requires that you use a 
name that doesn't conflict with any variable that you already want in the 
model, e.g.

   data$".weights." <- runif(nrow(data))
   lm(model, data=data, weights=.weights.)
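
Wrapped up as a function, the column approach might look like the sketch 
below. The function name fit_col, the example data frame A, and the 
name-clash check are my additions for illustration, not part of the 
original code.

```r
fit_col <- function(model, data) {
  # Refuse to overwrite an existing column of the same name.
  stopifnot(!".weights." %in% names(data))
  # This modifies the function's own copy of `data`; the caller's data
  # frame is left alone.
  data$.weights. <- runif(nrow(data))
  # .weights. is now found in `data`, so no environment lookup is needed.
  lm(model, data = data, weights = .weights.)
}

set.seed(2)
A <- data.frame(x = rnorm(20), y = rnorm(20))  # hypothetical example data
fit <- fit_col(y ~ x, A)
print(length(fit$weights))
```

Because the weights are resolved inside data=, this version does not 
depend on where the formula was created.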

The other approach is to set the environment of the formula to be the 
current environment. This will work as long as the formula doesn't refer 
to any variables in its original environment:

    environment(model)<-environment()
    w<-runif(nrow(data))
    lm(model,data=data, weights=w)
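
As a complete function this could be sketched as follows. The names 
fit_env, A and f are invented for the example; the three lines in the 
body are the ones given above.

```r
fit_env <- function(model, data) {
  # Re-point the formula at this function's frame. `model` is a local
  # copy, so the caller's formula keeps its original environment.
  environment(model) <- environment()
  w <- runif(nrow(data))
  # w is not a column of `data`, so lm() falls back to environment(model),
  # which is now this frame, and finds the local w.
  lm(model, data = data, weights = w)
}

set.seed(3)
A <- data.frame(x = rnorm(20), y = rnorm(20))  # hypothetical example data
f <- y ~ x
fit <- fit_env(f, A)
print(length(fit$weights))
```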


> # But fn2() works if n=100

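No, it just looks as though it does. I suspect you have a data frame 
called data, with 100 rows, in your workspace, so that nrow(data) in the 
weights expression picks up that object rather than the data argument 
of fn2().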

In a clean copy of R I get
> fn2(y ~ x, data=A);
Error in runif(n, min, max) : invalid arguments
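
The following sketch reproduces both situations. It assumes it is run in 
a fresh session; the data frame A and the names clean, masked and 
mismatch are made up for the example, and the exact wording of the error 
messages may differ between R versions.

```r
fn2 <- function(model, data) {
  lm(model, data = data, weights = runif(nrow(data)))
}

set.seed(4)
A <- data.frame(x = rnorm(100), y = rnorm(100))  # hypothetical example data

# Fresh workspace: inside the weights expression, `data` is not a column
# of A, so it is looked up from the formula's environment and resolves to
# the function utils::data. nrow() of a function is NULL, so runif() fails.
print(is.null(nrow(utils::data)))
clean <- try(fn2(y ~ x, A), silent = TRUE)
print(inherits(clean, "try-error"))

# With an unrelated 100-row object called `data` in the workspace, the
# call runs, but the weights are sized from that object, not from A.
data <- data.frame(z = 1:100)
masked <- fn2(y ~ x, A)
print(length(masked$weights))

# With any other number of rows the lengths no longer match and lm() fails.
mismatch <- try(fn2(y ~ x, A[1:50, ]), silent = TRUE)
print(inherits(mismatch, "try-error"))
```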

        -thomas


Thomas Lumley                   Assoc. Professor, Biostatistics
[EMAIL PROTECTED]       University of Washington, Seattle

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
