There is a minor error in my post from earlier today that I should
correct (see below):
On May 24, 2009, at 10:52 AM, Roger Levy wrote:
Dear Linda,
On May 20, 2009, at 3:34 AM, Linda Mortensen wrote:
Dear LanguageR users,
I'm trying to fit a mixed logit model using the lmer function in
the lme4 package. My question concerns the random effects part of
this model (i.e., the random effects for my subjects and items) and
how I decide between models that differ in the number of random
effect terms that are estimated.
First of all, in my assessment the problem of which random effects
terms to include in your model when the primary target of inference
is the fixed effects is still open.
So far, I have used two procedures:
1. For a given model, I remove a random effect term if it
correlates very strongly with either the intercept or any of the
other random effect terms. Eventually, I end up with a model in
which all correlations are modest.
This is an interesting idea, but I would emphasize two things:
1) it's important to distinguish between positive and negative
correlations. A strong negative correlation is telling you
something very important about your dataset. Imagine a word
recognition task where the response variable is answer correctness and
the covariate x1 is word frequency. A strong negative correlation
between intercept and x1 is telling you that participants who answer
more correctly overall are less sensitive to word frequency, and
vice versa, and that this is a very reliable generalization. You
can see this in model log-likelihoods too: compare the two lmer
model fits below.
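# Simulate binary responses with strongly negatively correlated
# random intercepts and random slopes for x1 (correlation = -0.9)
set.seed(9)
library(mvtnorm)
library(lme4)
k <- 10                 # number of clusters
n <- 1000               # number of observations
cl <- gl(k,1,n)         # cluster membership
x1 <- runif(n)
sigma <- matrix(c(1,-1.8,-1.8,4),2,2)  # covariance of (intercept, slope)
b <- rmvnorm(k,mean=c(0,0),sigma)      # per-cluster random effects
eta <- b[cl,1] + b[cl,2]*x1
y <- rbinom(n,1,exp(eta)/(1+exp(eta)))
# Random intercepts only vs. random intercepts plus slopes for x1.
# (Current lme4 releases deprecate lmer(..., family=); use glmer() there.)
lmer(y ~ 1 + (1 | cl),family="binomial")
lmer(y ~ 1 + (x1 | cl),family="binomial")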
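To make the comparison concrete, the two fits can be stored and the log-likelihoods and the estimated intercept/slope correlation pulled out directly. The following is only a sketch under some assumptions that are not in the original post: it uses glmer() from a current lme4, draws the random effects with a Cholesky factor instead of mvtnorm (so the simulated data differ from the data above), and the names d, m0 and m1 are made up for illustration.

```r
library(lme4)

set.seed(9)
k <- 10                                   # number of clusters
n <- 1000                                 # number of observations
d <- data.frame(cl = gl(k, 1, n), x1 = runif(n))

# Covariance of (intercept, slope); implied correlation is -0.9
sigma <- matrix(c(1, -1.8, -1.8, 4), 2, 2)
# Correlated per-cluster effects: standard normal draws times chol(sigma)
b <- matrix(rnorm(2 * k), k, 2) %*% chol(sigma)
d$y <- rbinom(n, 1, plogis(b[d$cl, 1] + b[d$cl, 2] * d$x1))

m0 <- glmer(y ~ 1 + (1 | cl),  data = d, family = binomial)
m1 <- glmer(y ~ 1 + (x1 | cl), data = d, family = binomial)

# Log-likelihoods of the two fits
logLik(m0)
logLik(m1)

# Estimated correlation between random intercept and random x1 slope
attr(VarCorr(m1)$cl, "correlation")
```

With only 10 clusters the estimated correlation is noisy, so it is worth looking at the log-likelihood difference alongside it rather than at the correlation alone.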
2) When you say "remove a term", what really would be justified is
if the random parameters for covariates x1 and x2 are correlated at
>0.99, create a third, "proxy" parameter x12=x1+x2, add x12 to the
random-effects structure, and drop x1 and x2. This would save you
two parameters at basically no modeling cost.
This proxy parameter x12 should be equal to x1+C*x2, for some value of
C which you could read off of the old model fit where x1 and x2 are
separate (divide the standard deviation of the random effect for x2 by
the standard deviation for x1).
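The recipe for C can be sketched in R as follows. This is an illustration rather than a tested procedure: the simulated data, the cluster count, and the names d, full, reduced and C are all hypothetical, glmer() from a current lme4 is assumed, and the random slopes are assumed to be positively correlated as in the >0.99 case described above.

```r
library(lme4)

set.seed(1)
k <- 30                                   # clusters (hypothetical size)
n <- 3000                                 # observations
d <- data.frame(cl = gl(k, 1, n), x1 = runif(n), x2 = runif(n))

# Simulated truth: the x1 and x2 slopes are perfectly correlated,
# and the x2 slope has twice the standard deviation of the x1 slope
u  <- rnorm(k)
b0 <- rnorm(k, sd = 0.5)
eta <- b0[d$cl] + 1 * u[d$cl] * d$x1 + 2 * u[d$cl] * d$x2
d$y <- rbinom(n, 1, plogis(eta))

# Old model fit, with x1 and x2 as separate random slopes
full <- glmer(y ~ 1 + (1 + x1 + x2 | cl), data = d, family = binomial)

# Read C off the old fit: sd of the x2 random effect over sd of x1
sds <- attr(VarCorr(full)$cl, "stddev")
C <- sds[["x2"]] / sds[["x1"]]

# Proxy covariate replaces x1 and x2 in the random-effects structure
d$x12 <- d$x1 + C * d$x2
reduced <- glmer(y ~ 1 + (1 + x12 | cl), data = d, family = binomial)

# Compare the fits to check how little is lost
logLik(full)
logLik(reduced)
```

The full fit may produce singular-fit or convergence warnings, which is expected when two random slopes are nearly perfectly correlated.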
2. I compare the quasi-log likelihood (logLik) values of a model
with a given random effect term (e.g., an interaction term, as in
... (1 + a * b | sub)) and of a model without that term (... (1 + a +
b | sub)). If the logLik values are very similar (i.e., if the value
is not, or at least not much, smaller for the model without the term
than for the model with the term), I go for the model without the
term.
This is OK, and more of the recommended practice (see Baayen et al.,
2008, for discussion with respect to linear mixed-effect models).
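You can actually do a likelihood-ratio test, though with the dual
caveats that (a) the Laplace-approximated log-likelihood is not the
true log-likelihood; and (b) the test is conservative, because the
null hypothesis puts a variance parameter on the boundary of its
parameter space.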
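A likelihood-ratio test of this kind might look like the sketch below. It reuses the simulation setup from earlier in the thread, but with assumptions of my own: glmer() from a current lme4 (whose default fit uses the Laplace approximation), random effects drawn via a Cholesky factor instead of mvtnorm, and illustrative names such as m0, m1, lr and df_diff. Both caveats above apply to the resulting p-value.

```r
library(lme4)

set.seed(9)
k <- 10
n <- 1000
d <- data.frame(cl = gl(k, 1, n), x1 = runif(n))
sigma <- matrix(c(1, -1.8, -1.8, 4), 2, 2)
b <- matrix(rnorm(2 * k), k, 2) %*% chol(sigma)
d$y <- rbinom(n, 1, plogis(b[d$cl, 1] + b[d$cl, 2] * d$x1))

# Nested models: without and with the random slope for x1
m0 <- glmer(y ~ 1 + (1 | cl),  data = d, family = binomial)
m1 <- glmer(y ~ 1 + (x1 | cl), data = d, family = binomial)

# Likelihood-ratio statistic computed by hand
lr <- 2 * (as.numeric(logLik(m1)) - as.numeric(logLik(m0)))
# Extra parameters in m1: slope variance plus intercept/slope covariance
df_diff <- attr(logLik(m1), "df") - attr(logLik(m0), "df")
p_value <- pchisq(lr, df = df_diff, lower.tail = FALSE)
c(statistic = lr, df = df_diff, p = p_value)

# The same comparison via the anova() method for fitted models
anova(m0, m1)
```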
Is it acceptable to select a model on the basis of this comparison?
Or, when the logLik values are similar (which they usually are for
my models), should I instead look at the measures of likelihood
that take into account the number of parameters in a model when
evaluating its fit (i.e., AIC, BIC, deviance)? According to these
other measures, a simple model seems always to be better than a
more complex one, but if I want to rule out that my fixed effects
can be explained, in part, by random effects for subjects and
items, then a simple model (with few random effects) is not
necessarily better than a complex one, I would think.
Well, first of all the deviance is just -2*logLik. The AIC and BIC
are still dominated by log-likelihood too: each is -2*logLik plus a
penalty on the number of parameters. And the logLik can be
appreciably better for more complex models -- see my above example.
Finally, I'd agree
with you that it's better to be cautious and include the extra, more
complex terms if you want to be sure that you have a "real" fixed
effect.
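The relation between these measures can be checked by hand, as in the sketch below. The assumptions are mine: glmer() from a current lme4, a Cholesky-based simulation in place of mvtnorm, and illustrative names (m1, ll, p, n_obs). As far as I know, deviance() on a glmer fit in current lme4 does not necessarily return -2*logLik, so the sketch works from logLik() directly.

```r
library(lme4)

set.seed(9)
k <- 10
n <- 1000
d <- data.frame(cl = gl(k, 1, n), x1 = runif(n))
sigma <- matrix(c(1, -1.8, -1.8, 4), 2, 2)
b <- matrix(rnorm(2 * k), k, 2) %*% chol(sigma)
d$y <- rbinom(n, 1, plogis(b[d$cl, 1] + b[d$cl, 2] * d$x1))

m1 <- glmer(y ~ 1 + (x1 | cl), data = d, family = binomial)

ll    <- as.numeric(logLik(m1))    # (approximate) log-likelihood
p     <- attr(logLik(m1), "df")    # number of estimated parameters
n_obs <- nobs(m1)                  # number of observations

# Both criteria are -2*logLik plus a penalty on the parameter count
aic_by_hand <- -2 * ll + 2 * p
bic_by_hand <- -2 * ll + log(n_obs) * p

all.equal(aic_by_hand, AIC(m1))
all.equal(bic_by_hand, BIC(m1))
```

Because the penalty grows only with the parameter count, a large enough gain in logLik still favours the more complex random-effects structure under either criterion.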
Hope this helps.
Best
Roger
--
Roger Levy Email: [email protected]
Assistant Professor Phone: 858-534-7219
Department of Linguistics Fax: 858-534-4789
UC San Diego Web: http://ling.ucsd.edu/~rlevy
_______________________________________________
R-lang mailing list
[email protected]
http://pidgin.ucsd.edu/mailman/listinfo/r-lang