Just to provide some closure on this thread, let me add two comments: 1. Doug's version of my sweep function:
diffid1 <- function(h, id) { id <- as.factor(id)[ , drop = TRUE] apply(as.matrix(h), 2, function(x) x - tapply(x, id, mean)[id]) } is far more elegant than my original, and works perfectly, but 2. I should have mentioned that proposed strategy gets the coefficient estimates right, however their standard errors need a degrees of freedom correction, which in the present instance is non-negligible -- sqrt(98/89) -- since the lm() step doesn't know that we have already estimated the fixed effects with the sweep operation. url: www.econ.uiuc.edu/~roger Roger Koenker email [EMAIL PROTECTED] Department of Economics vox: 217-333-4558 University of Illinois fax: 217-244-6678 Champaign, IL 61820 On May 5, 2007, at 7:16 PM, Douglas Bates wrote: > On 5/5/07, roger koenker <[EMAIL PROTECTED]> wrote: >> >> On May 5, 2007, at 3:14 PM, Douglas Bates wrote: >> > >> > As Roger indicated in another reply you should be able to obtain >> the >> > results you want by sweeping out the means of the groups from >> both x >> > and y. However, I tried Roger's function and a modified version >> that >> > I wrote and could not show this. I'm not sure what I am doing >> wrong. >> >> Doug, Isn't it just that you are generating a balanced factor and >> Ivo is >> generating an unbalanced one -- he wrote: > >> > fe = as.factor( as.integer( runif(100)*10 ) ); > >> the coefficient on x is the same.... > >> or, aarrgh, is it that you don't like the s.e. being wrong. I >> didn't notice >> this at first. But it shouldn't happen. I'll have to take another >> look at this. > > No, my mistake was much dumber than that. I was comparing the wrong > coefficient. For some reason I was comparing the coefficient for x in > the second fit to the Intercept from the first fit. > > I'm glad that it really is working and, yes, you are right, the > degrees of freedom are wrong in the second fit because the effect of > those 10 degrees of freedom are removed from the data before the model > is fit. > > >> > I enclose a transcript that shows that I can reproduce the >> result from >> > Roger's function but it doesn't do what either of us think it >> should. >> > BTW, I realize that the estimate for the Intercept should be >> zero in >> > this case. >> > >> > >> > >> >> now, with a few IQ points more, I would have looked at the lme >> >> function instead of the nlme function in library(nlme). [then >> >> again, I could understand stats a lot better with a few more IQ >> >> points.] I am reading the lme description now, but I still don't >> >> understand how to specify that I want to have dummies in my >> >> specification, plus the x variable, and that's it. I think I >> am not >> >> understanding the integration of fixed and random effects in >> the same >> >> R functions. >> >> >> >> thanks for pointing me at your lme4 library. on linux, version >> >> 2.5.0, I did >> >> R CMD INSTALL matrix*.tar.gz >> >> R CMD INSTALL lme4*.tar.gz >> >> and it installed painlessly. (I guess R install packages don't >> have >> >> knowledge of what they rely on; lme4 requires matrix, which >> the docs >> >> state, but having gotten this wrong, I didn't get an error. no >> big >> >> deal. I guess I am too used to automatic resolution of >> dependencies >> >> from linux installers these days that I did not expect this.) >> >> >> >> I now tried your specification: >> >> >> >> > library(lme4) >> >> Loading required package: Matrix >> >> Loading required package: lattice >> >> > lmer(y~x+(1|fe)) >> >> Linear mixed-effects model fit by REML >> >> Formula: y ~ x + (1 | fe) >> >> AIC BIC logLik MLdeviance REMLdeviance >> >> 282 290 -138 270 276 >> >> Random effects: >> >> Groups Name Variance Std.Dev. >> >> fe (Intercept) 0.000000000445 0.0000211 >> >> Residual 0.889548532468 0.9431588 >> >> number of obs: 100, groups: fe, 10 >> >> >> >> Fixed effects: >> >> Estimate Std. Error t value >> >> (Intercept) -0.0188 0.0943 -0.199 >> >> x 0.0528 0.0904 0.585 >> >> >> >> Correlation of Fixed Effects: >> >> (Intr) >> >> x -0.022 >> >> Warning messages: >> >> 1: Estimated variance for factor 'fe' is effectively zero >> >> in: `LMEoptimize<-`(`*tmp*`, value = list(maxIter = 200L, >> >> tolerance = >> >> 0.0000000149011611938477, >> >> 2: $ operator not defined for this S4 class, returning NULL in: x >> >> $symbolic.cor >> >> >> >> Without being a statistician, I can still determine that this >> is not >> >> the model I would like to work with. The coefficient is >> 0.0528, not >> >> 0.0232. (I am also not sure why I am getting these warning >> messages >> >> on my system, either, but I don't think it matters.) >> >> >> >> is there a simple way to get the equivalent specification for my >> >> smple >> >> model, using lmer or lme, which does not choke on huge data sets? >> >> >> >> regards, >> >> >> >> /ivo >> >> >> >> <Ivo_Rout.txt> >> > ______________________________________________ >> > R-help@stat.math.ethz.ch mailing list >> > https://stat.ethz.ch/mailman/listinfo/r-help >> > PLEASE do read the posting guide http://www.R-project.org/posting- >> > guide.html >> > and provide commented, minimal, self-contained, reproducible code. >> >> ______________________________________________ R-help@stat.math.ethz.ch mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.