Dear all,

I am somewhat late to the party (I was traveling until yesterday), but I would also like to add my two cents to this discussion, especially as afex was mentioned here.

afex::mixed() indeed does essentially what is suggested here: it creates numerical variables from your factors and fits the models in a sequential manner so that you get tests of all effects. To get the likelihood ratio tests you want, you need to set method = "LRT" in the call to mixed() (the default method, "KR", uses the Kenward-Roger approximation for p-values, which is not appropriate for GLMMs; "PB", the parametric bootstrap, can also be used for GLMMs).
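
For example, a sketch using Rachel's variable names (untested on her data; note that mixed() resets the contrasts to contr.sum itself):

library(afex)
## one likelihood ratio test per effect; specifying family = binomial
## makes mixed() fit the models via glmer()
m <- mixed(Target_E2_pref ~ Listener.f * SyntaxType.f +
             (1 + Listener.f * SyntaxType.f | Subject) +
             (1 + Listener.f | Picture),
           data = SoleTrain, family = binomial, method = "LRT")
m  # prints the table of LRTs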

In contrast to Dale's suggestion, mixed() by default uses R's built-in factor type, specifically with contr.sum() (i.e., sum-to-zero) contrasts, which use only 1, 0, and -1 as values. To create the numeric variables it then uses model.matrix() and works from there.
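
For example:

f <- gl(3, 2)                 # a 3-level factor
contrasts(f) <- contr.sum(3)  # sum-to-zero coding
contrasts(f)
##   [,1] [,2]
## 1    1    0
## 2    0    1
## 3   -1   -1
model.matrix(~ f)             # columns f1 and f2 are the numeric predictors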

I am not quite sure I agree with Daniel's point. The described procedure tests the null hypothesis of an effect at the (unweighted) mean of the other effects. I do not see how this should be nonsensical. In a factorial design this amounts to assuming that the other factors have no effect (i.e., that there are no deviations from the unweighted grand mean produced by the levels of the other factors).
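
A quick made-up illustration with lm(), using a saturated 2x2 model with unbalanced cells:

d <- expand.grid(A = gl(2, 1), B = gl(2, 1))
d <- d[rep(1:4, c(10, 10, 10, 30)), ]    # unbalanced cell sizes
d$y <- rnorm(nrow(d))
contrasts(d$A) <- contr.sum(2)
contrasts(d$B) <- contr.sum(2)
cm <- tapply(d$y, list(d$A, d$B), mean)  # cell means
coef(lm(y ~ A * B, data = d))["A1"]      # the tested effect of A equals ...
mean(cm[1, ]) - mean(cm)                 # ... A1 at the unweighted mean of B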

Finally, as many have noted, the problem is that one cannot remove lower-order effects in R via the formula interface. A quick example (using lm) can be found here: http://stackoverflow.com/q/11335923/289572 This behavior unfortunately generalizes to all interaction orders and also to all model types (lm, lmer, glmer, ...). It is also the reason why I developed afex::mixed(), to solve exactly this problem. If there are other modeling functions that are as useful as lmer or glmer, please let me know so I can make sure to also add the automatic testing for those functions.
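
For completeness, a minimal made-up version of the behavior behind that link:

d <- data.frame(y = rnorm(40), a = gl(2, 20), b = gl(2, 1, 40))
m1 <- lm(y ~ a * b, data = d)
m2 <- lm(y ~ b + a:b, data = d)  # 'a' removed from the formula ...
anova(m1, m2)                    # ... yet 0 df difference: identical fits
## R silently re-codes 'a' inside a:b, so both models span the same
## design space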

In contrast to Dale, I believe that the probability of committing an error when creating the numeric variables by hand is higher than when letting an algorithm do it.

Cheers,
Henrik



On 19.09.2016 at 13:56, Dale Barr wrote:

Hi Rachel,

I don't recommend using the built-in 'factor' type variables when using
R linear model functions, because weird things happen when you try to
drop terms from a model that includes interactions.  It is better to
create your own numeric predictors, e.g.:

## create centered predictor for listener
## NB: replace 'somelevel' with the name of one of the levels of your variable
SoleTrain$L <- (SoleTrain$Listener == "somelevel") -
  mean(SoleTrain$Listener == "somelevel")

## create 2 centered predictors for SyntaxType
## replace 'level1' and 'level2' with names of 2 levels of your var
SoleTrain$ST1 <- (SoleTrain$SyntaxType == "level1") -
  mean(SoleTrain$SyntaxType == "level1")

SoleTrain$ST2 <- (SoleTrain$SyntaxType == "level2") -
  mean(SoleTrain$SyntaxType == "level2")
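
## optional sanity check: the distinct values of the new predictors
## should match the expectations described below
table(SoleTrain$L)
table(SoleTrain$ST1)
table(SoleTrain$ST2)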

These should give you values of -.5, +.5 for new predictor L, and
something like -.33 and +.67 for each of your two new ST1 and ST2
variables (depending on how balanced your data is).  Check that you are
getting (approximately) the right values.  The centering ensures that
you are looking at main rather than simple effects.  You can then fit
your model as:

mod <- glmer(Target_E2_pref ~ L * (ST1 + ST2) +
  (1 + L * (ST1 + ST2) | Subject) + (1 + L | Picture),
  data = SoleTrain, family = binomial, verbose = TRUE,
  control = glmerControl(optCtrl = list(maxfun = 20000)))

and drop out terms as needed (you might want to look at the update()
function which makes this easier).
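
For example (a sketch, using the predictors created above):

## refit without the main effect of L; because L is numeric, this
## really removes the L column from the design matrix
mod_noL <- update(mod, . ~ . - L)
anova(mod, mod_noL)  # likelihood ratio test for the main effect of L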

Alternative solution: check out the afex package, which (I think) can do
the factor coding and LRTs for you.

-D

Rachel Ostrand writes:

Hi everyone,

I'm having trouble with some 2-factor glmer models that I'm trying to run:
the model with one of the main effects removed comes out identical to the
full model. Some colleagues suggested that this might be due to the coding
of my factors, specifically because I have a factor with 3 levels, and that
that factor needs to be treated differently, but I'm not sure how - or why -
to do that.

Brief summary of my data:
-My DV (called Target_E2_pref) is a binary categorical variable.
-There are two categorical IVs: Listener (2 levels) and SyntaxType (3
levels).
-Listener varies by both subject and item (i.e., picture); SyntaxType only
varies by subject.

When I dummy coded my variables using contr.treatment(), the model with the
main effect of Listener removed from the fixed effects comes out identical
to the full model:

SoleTrain = read.table(paste(path, "SoleTrain.dat", sep=""), header=T)
SoleTrain$Listener.f = factor(SoleTrain$Listener, labels=c("E1", "E2"))
contrasts(SoleTrain$Listener.f) = contr.treatment(2)
SoleTrain$SyntaxType.f = factor(SoleTrain$SyntaxType,
                                labels=c("Transitive", "Locative", "Dative"))
contrasts(SoleTrain$SyntaxType.f) = contr.treatment(3)

# Create full model:
SoleTrain.full <- glmer(Target_E2_pref ~ Listener.f*SyntaxType.f +
    (1 + Listener.f*SyntaxType.f|Subject) + (1 + Listener.f|Picture),
  data = SoleTrain, family = binomial, verbose=T,
  control=glmerControl(optCtrl=list(maxfun=20000)))

# Create model with main effect of Listener removed:
SoleTrain.noListener <- glmer(Target_E2_pref ~ SyntaxType.f +
    Listener.f:SyntaxType.f +
    (1 + Listener.f*SyntaxType.f|Subject) + (1 + Listener.f|Picture),
  data = SoleTrain, family = binomial, verbose=T,
  control=glmerControl(optCtrl=list(maxfun=20000)))

anova(SoleTrain.full, SoleTrain.noListener)
Data: SoleTrain
Models:
SoleTrain.full: Target_E2_pref ~ Listener.f * SyntaxType.f +
    (1 + Listener.f * SyntaxType.f | Subject) + (1 + Listener.f | Picture)
SoleTrain.noListener: Target_E2_pref ~ SyntaxType.f + Listener.f:SyntaxType.f +
    (1 + Listener.f * SyntaxType.f | Subject) + (1 + Listener.f | Picture)
                     Df    AIC    BIC  logLik deviance Chisq Chi Df Pr(>Chisq)
SoleTrain.full       30 2732.5 2908.5 -1336.2   2672.5
SoleTrain.noListener 30 2732.5 2908.5 -1336.2   2672.5     0      0          1

However, I don't have this problem when I test for the main effect of
SyntaxType, and remove the SyntaxType.f factor from the fixed effects.
(That is, this produces a different model than the full model.)

Someone suggested that Helmert coding was better for factors with more than
two levels, so I tried running the same models except with Helmert coding
[contrasts(SoleTrain$SyntaxType.f) = contr.helmert(3)], but the models come
out identical to the way they do with dummy coding. So why does the model
with the main effect of Listener removed come out the same as the model with
the main effect of Listener retained?

Any suggestions as to what I'm doing wrong?

Thanks!
Rachel



