I have a large dataset where each Subject answered seven similar
Items, which are binary yes/no questions. So I've always used Subject
and Item random effects in my models, fit with lmer(), e.g.:
model <- lmer(Response ~ Race + Gender + ... + (1 | Subject_ID) +
              (1 | Item_ID), data = data, family = binomial)
But I recently realized something. Most of the variables that I've
tested as fixed effects are properties of the subject (e.g. Race,
Gender, etc.). Is it correct to be using a random effect Subject that
is nested within (partially-crossed) fixed effects like Gender and
Race? - I hope I'm using the terminology correctly.
Today I accidentally ran a model without the Subject random effect,
and the fixed effect of Race was significant for the first time. With
the Subject effect included, Race is not significant. The same happens
if Race is treated as random, though the effect is weaker then. The
following table shows Somers' Dxy from somers2() for each model and
the p-value from anova() for each pair of models, with and without the
Subject term.

                    random Subject term         no Subject term
                    Somers' Dxy  p from anova() Somers' Dxy  p from anova()
no Race term        0.8487                      0.4096
  vs.                            0.30                        0.00064
fixed Race term     0.8483                      0.4332

no Race term        0.8487                      0.4096
  vs.                            0.96                        0.0047
random Race term    0.8486                      0.4334

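For concreteness, each pair in the table was compared roughly as in the sketch below. This is not my actual code or data: the data frame d is simulated as a stand-in, the names d, m0, m1, lrt and dxy are only illustrative, and glmer() is used because it is the current lme4 function for binomial fits (assuming the lme4 and Hmisc packages are installed).

```r
library(lme4)   # glmer()
library(Hmisc)  # somers2()

set.seed(1)

# Simulated stand-in data (NOT the real dataset): 100 subjects x 7 items
n_subj <- 100
n_item <- 7
d <- expand.grid(Subject_ID = factor(1:n_subj), Item_ID = factor(1:n_item))

# Race is a property of the subject, so it is constant within each subject
race_by_subj <- factor(sample(c("A", "B", "C"), n_subj, replace = TRUE))
d$Race <- race_by_subj[as.integer(d$Subject_ID)]

# Binary responses driven by subject and item intercepts only
subj_eff <- rnorm(n_subj, sd = 1.5)
item_eff <- rnorm(n_item, sd = 0.5)
eta <- subj_eff[as.integer(d$Subject_ID)] + item_eff[as.integer(d$Item_ID)]
d$Response <- rbinom(nrow(d), 1, plogis(eta))

# One pair of models, both with the Subject term: without and with fixed Race
m0 <- glmer(Response ~ 1 + (1 | Subject_ID) + (1 | Item_ID),
            data = d, family = binomial)
m1 <- glmer(Response ~ Race + (1 | Subject_ID) + (1 | Item_ID),
            data = d, family = binomial)

# Likelihood-ratio comparison of the pair (the p-value column in the table)
lrt <- anova(m0, m1)

# Somers' Dxy between fitted probabilities and observed 0/1 responses
dxy <- somers2(fitted(m1), d$Response)["Dxy"]

print(lrt)
print(dxy)
```

The "no Subject term" columns come from the same comparison with (1 | Subject_ID) dropped from both models, and the "random Race term" rows replace the fixed Race term with (1 | Race).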
Adding the Subject effect always improves the fit of the model
substantially (Dxy rises from about 0.41-0.43 to about 0.85), so I
would certainly want to keep it. But if there is a real effect of
Race, why does adding the Subject effect make it go away?
I thought the Subject random effect would be a sort of residual
subject effect, once everything else was accounted for by other
subject properties (some of which do remain significant with Subject
in there as well).
This must be a common scenario, since people are interested in
inherent properties of subjects, yet also try to model and 'factor
out' the random individual variation between people. I'm simply not
very familiar with the relevant literature, and I hope someone here
can help.
Thank you,
Daniel
P.S. Also, why does treating Race as a random factor have (very
slightly) more of an effect on Somers' Dxy, while judging by anova()
it's "more significant" as a fixed factor?