Daniel Malter <daniel <at> umd.edu> writes:

> id <- rep(c(1:100), each=2)
> obs <- rep(c(0:1), 100)
> d <- rep(sample(c(-1,1), 100, replace=TRUE), each=2)
> base.happy <- rep(rnorm(100), each=2)
> happy <- base.happy + 1.5*d*obs + rnorm(200)
>
> data <- data.frame(id, obs, d, happy)
>
> I am statistically confused tonight. When the assumptions of a random
> effects estimator are warranted, random effects should be a more
> efficient estimator than fixed effects because it uses fewer degrees
> of freedom (estimating just the variance parameter of the normal
> rather than using one df for each included fixed effect, I thought).
> However, I don't find this to be the case in this simulated example.
>
> For the sake of the example, assume you measure subjects' happiness
> before exposing them to a happy or sad movie, and then you measure
> their happiness again after watching the movie. Here, "id" marks the
> subject, "obs" marks the pre- and post-treatment observations, "d" is
> the treatment indicator (whether the subject watched the happy or sad
> movie), "base.happy" is the ~N(0,1)-distributed individual effect
> a(i), "happy" is the measured happiness for each subject pre- and
> post-treatment, respectively, and the error term u(i,t) is also
> distributed ~N(0,1).
>
> # Now run the random and fixed effects models
> library(lme4)
> reg.re <- lmer(happy ~ factor(obs)*factor(d) + (1|id), data=data)
>
> reg.fe1 <- lm(happy ~ factor(id) + factor(obs)*factor(d), data=data)
> summary(reg.fe1)
>
> library(plm)
> reg.fe2 <- plm(happy ~ factor(obs)*factor(d), index=c('id','obs'),
>                model="within", data=data)
> summary(reg.fe2)
>
> I am confused why FE and RE models are virtually equally efficient in
> this case. Can somebody lift my confusion?

I have a couple of questions/suggestions:

(1) This would probably get more appropriate attention on the
r-sig-mixed-models list (I'm not forwarding it there myself because I'm
answering via Gmane and I'm lazy). [When/if you forward it, do mention
that you're cross-posting from R-help on purpose ...]

(2) Can you be more specific about your metric of efficiency in this
case? Are you referring to the fact that the estimates, standard errors,
etc. are identical?

lm:
                         Estimate Std. Error t value Pr(>|t|)
factor(obs)1:factor(d)1   3.39763    0.25200  13.483  < 2e-16 ***

plm:
                         Estimate Std. Error t-value  Pr(>|t|)
factor(obs)1:factor(d)1   3.39763    0.25200 13.4826 < 2.2e-16 ***

(3) I think in this case that the random-effects test, when estimated by
REML, is equivalent to a (pooled) t-test on (happy(after) - happy(before))
between the d=-1 and d=1 groups ... ? Is that useful?

library(plyr)
diffdata <- ddply(data, "id",
                  function(x) data.frame(d=x$d[1], dhappy=diff(x$happy)))
with(diffdata, t.test(dhappy[d==1], dhappy[d==-1], var.equal=TRUE))

data:  dhappy[d == 1] and dhappy[d == -1]
t = 13.4826, df = 98, p-value < 2.2e-16
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 2.897540 3.897713
sample estimates:
mean of x mean of y
 1.784575 -1.613051

Note the same t-statistic here. I think that, in this balanced case,
there is no difference ...
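A quick way to see point (3) numerically, without plyr: in this balanced
two-period design the "within" (FE) interaction estimate is just the
difference in mean within-subject changes between the two treatment
groups. A minimal base-R sketch, assuming the simulated `data` frame
from the original post (the names `dhappy`, `d1`, and `fe.est` are
introduced here for illustration):

```r
## Per-subject change in happiness (data is sorted by id, obs within id)
dhappy <- with(data, happy[obs == 1] - happy[obs == 0])
## One treatment indicator per subject
d1 <- with(data, d[obs == 1])
## Difference in mean changes between the happy- and sad-movie groups;
## this should reproduce the factor(obs)1:factor(d)1 estimate from
## lm()/plm() above (3.39763 in the run shown)
fe.est <- mean(dhappy[d1 == 1]) - mean(dhappy[d1 == -1])
fe.est
```

With the output printed above, mean(x) - mean(y) = 1.784575 - (-1.613051)
= 3.397626, matching the reported interaction estimate to rounding.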
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.