Say X is numeric. Y is a factor. You're testing the hypothesis that the 
unweighted average of the effects of the levels of Y equals zero. This is a 
pretty ludicrous null hypothesis. 

> On Sep 20, 2016, at 12:27 PM, T. Florian Jaeger <timegu...@gmail.com> wrote:
> 
> Guys,
> 
> just a quick note, in case it's not apparent to everyone (I had emailed this 
> earlier to Rachel): what happens in Rachel's model is simply that R defaults 
> to simple effects coding when a 'main' effect is removed while the 
> interaction is still included (note that this, I think, overrides whatever 
> contrasts you have specified for the factor you remove). That's actually a 
> very useful default. To me, the thing that was puzzling at first is the same 
> thing that Roger commented on: it should work the same whether the factor 
> you remove has two levels or three. Indeed, when I tried to replicate 
> Rachel's problem, I did get the same simple-effects reparameterization 
> regardless of how many levels the removed factor has.
> 
> Florian
> 
>> On Tue, Sep 20, 2016 at 2:41 PM Wednesday Bushong 
>> <wednesday.bush...@gmail.com> wrote:
>> Let me also say something w.r.t. coding because I think you also expressed 
>> doubt about what kind of coding scheme to use.
>> 
>> The crucial thing to remember when interpreting coefficients from an R 
>> model summary is that each coefficient gives the change in the outcome as 
>> that predictor's coding variable moves from 0 to 1, with all the other 
>> coding variables held at 0.
>> 
>> In the case of dummy coding, then, the "main effect" of Listener is actually 
>> the difference in log odds going from the first level of Listener to the 
>> second when the two SyntaxType dummy variables are at 0 -- that is, when 
>> SyntaxType is at its first level. So this is really just a pairwise 
>> comparison between two groups, and says nothing about the average effect of 
>> Listener across the SyntaxType groups. To get the interpretation of 
>> Listener as its effect at the average of all SyntaxType groups, you would 
>> have to contrast code SyntaxType (b/c then 0 will be the avg of all the 
>> levels). Similar interpretations hold for the other terms in a fully 
>> dummy-coded model: each SyntaxType effect is interpreted w.r.t. the 
>> reference level of Listener, and each Listener:SyntaxType term is the 
>> difference between the Listener effect at that SyntaxType level and its 
>> effect at the reference level -- a comparison anchored to the reference 
>> cell, not the omnibus interaction you might expect from ANOVA. So be 
>> careful with coding!
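To make this concrete, here is a minimal sketch (the data frame and level names are invented for illustration, not from Rachel's data): under dummy coding the Listener column is a 0/1 indicator, so its coefficient is a simple effect at the reference SyntaxType level, whereas under sum coding the Listener column averages to zero within every SyntaxType level, so its coefficient is the effect at the unweighted average of the SyntaxType levels.

```r
# Toy design with every Listener x Syntax combination (names are made up)
d <- expand.grid(Listener = factor(c("L1", "L2")),
                 Syntax   = factor(c("S1", "S2", "S3")))

# Dummy (treatment) coding, R's default: the ListenerL2 column is a 0/1
# indicator, so its coefficient is the Listener effect at reference level S1
m.dummy <- model.matrix(~ Listener * Syntax, d)

# Sum coding: the Listener column is +1/-1 and sums to 0 within each Syntax
# level, so its coefficient is the Listener effect averaged over Syntax levels
contrasts(d$Listener) <- contr.sum(2)
contrasts(d$Syntax)   <- contr.sum(3)
m.sum <- model.matrix(~ Listener * Syntax, d)

colnames(m.dummy)  # treatment-coded columns
colnames(m.sum)    # sum-coded columns
```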
>> 
>> Of course, you can mix and match your coding schemes -- for instance, if you 
>> want to get the main effect of Listener at the avg. of SyntaxType but wanted 
>> pairwise comparisons of SyntaxType within one particular Listener group, you 
>> could contrast code SyntaxType and dummy code Listener appropriately -- but 
>> in general, the most common thing to do will be contrast coding all factors, 
>> which will give you the standard ANOVA output interpretation.
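A sketch of the mix-and-match idea (names again invented, not Rachel's variables): sum-code SyntaxType so that 0 is the unweighted average of its levels, but dummy-code Listener so that SyntaxType comparisons are simple effects within the reference Listener group.

```r
d <- expand.grid(Listener = factor(c("L1", "L2")),
                 Syntax   = factor(c("S1", "S2", "S3")))
contrasts(d$Syntax)   <- contr.sum(3)        # 0 = unweighted avg of Syntax levels
contrasts(d$Listener) <- contr.treatment(2)  # 0 = reference Listener group
mm <- model.matrix(~ Listener * Syntax, d)
# The Listener column is now interpreted at the average of the Syntax levels,
# while each Syntax column is interpreted within the reference Listener group.
colnames(mm)
```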
>> 
>> -Wed
>> 
>>> On Tue, Sep 20, 2016 at 1:58 PM Wednesday Bushong 
>>> <wednesday.bush...@gmail.com> wrote:
>>> Hi Rachel,
>>> 
>>> I think at times like this it's useful to look at exactly how R assigns 
>>> factors. When you add interactions, R does a lot of behind-the-scenes work 
>>> that isn't immediately apparent. One way to look into this in more detail 
>>> is the really nice function model.matrix(), which, given a data frame and 
>>> a model formula, will show you all of the coding variables that are 
>>> created in order to fit the model and what their values are for each 
>>> combination of factors in the dataset. I've shown this below.
>>> 
>>> # create data frame w/ each factor level combo
>>> d <- data.frame(Listener.f = rep(c("Listener1", "Listener2"), 3),
>>>                 SyntaxType.f = rep(c("Syntax1", "Syntax2", "Syntax3"),
>>>                                    each = 2),
>>>                 Target_E2_pref = rnorm(6))
>>> # make factors
>>> d$Listener.f <- factor(d$Listener.f)
>>> d$SyntaxType.f <- factor(d$SyntaxType.f)
>>> 
>>> # create model formulas corresponding to the full and reduced models
>>> mod.formula <- ~ 1 + Listener.f * SyntaxType.f
>>> mod.formula.reduced <- ~ 1 + SyntaxType.f + Listener.f:SyntaxType.f
>>> # get variable assignments for all factor level combos
>>> mod.matrix <- model.matrix(mod.formula, d)
>>> mod.matrix.reduced <- model.matrix(mod.formula.reduced, d)
>>> 
>>> If you look at mod.matrix and mod.matrix.reduced, you'll see that they each 
>>> have the same dimensionality. Digging in further, we can see why this is. 
>>> Let's look at the column names of each model matrix:
>>> 
>>> colnames(mod.matrix)
>>> [1] "(Intercept)"
>>> [2] "Listener.fListener2"
>>> [3] "SyntaxType.fSyntax2"
>>> [4] "SyntaxType.fSyntax3"
>>> [5] "Listener.fListener2:SyntaxType.fSyntax2"
>>> [6] "Listener.fListener2:SyntaxType.fSyntax3"
>>> 
>>> colnames(mod.matrix.reduced)
>>> [1] "(Intercept)"
>>> [2] "SyntaxType.fSyntax2"
>>> [3] "SyntaxType.fSyntax3"
>>> [4] "SyntaxType.fSyntax1:Listener.fListener2"
>>> [5] "SyntaxType.fSyntax2:Listener.fListener2"
>>> [6] "SyntaxType.fSyntax3:Listener.fListener2"
>>> 
>>> The interaction columns are what differ. Now, don't ask me why, but the 
>>> way R appears to handle removing a main effect from a model while keeping 
>>> the interaction is to add another interaction dummy variable that makes 
>>> the model equivalent to the full one. (If you look at the values each 
>>> factor combo takes on, you'll see that this particular dummy variable is 
>>> 1 when Listener = Listener2 and SyntaxType = Syntax1, and 0 otherwise.)
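One can check that claim directly. This sketch rebuilds the same toy design as above (nothing here is specific to Rachel's actual data) and compares the extra interaction column against the Listener2-within-Syntax1 indicator:

```r
d <- data.frame(Listener.f   = factor(rep(c("Listener1", "Listener2"), 3)),
                SyntaxType.f = factor(rep(c("Syntax1", "Syntax2", "Syntax3"),
                                          each = 2)))
# reduced model: main effect of Listener removed, interaction kept
mm.red <- model.matrix(~ 1 + SyntaxType.f + Listener.f:SyntaxType.f, d)
# the "extra" column is the interaction dummy involving reference level Syntax1
extra <- mm.red[, grep("Syntax1:", colnames(mm.red))]
# hand-built indicator: 1 iff Listener2 and Syntax1, else 0
ind <- as.numeric(d$Listener.f == "Listener2" & d$SyntaxType.f == "Syntax1")
all(extra == ind)  # TRUE: the extra column is exactly that indicator
```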
>>> 
>>> The way to solve this is presented in the paper Roger linked above (p. 4 
>>> is the most relevant here). His particular example uses contrast coding, 
>>> but you can make it work in exactly the same way with dummy coding (though 
>>> make sure dummy coding is what you really want given the specific 
>>> hypothesis you're testing!):
>>> 
>>> # make numeric versions of factors
>>> # (can easily replace contr.treatment w/ whatever coding scheme you want)
>>> d$Listener.numeric <- sapply(d$Listener.f,
>>>                              function(i) contr.treatment(2)[i, ])
>>> d$Syntax1.numeric <- sapply(d$SyntaxType.f,
>>>                             function(i) contr.treatment(3)[i, ])[1, ]
>>> d$Syntax2.numeric <- sapply(d$SyntaxType.f,
>>>                             function(i) contr.treatment(3)[i, ])[2, ]
>>> 
>>> # check model matrix
>>> mod.formula.new <- formula(~ 1 + Syntax1.numeric + Syntax2.numeric + 
>>> Listener.numeric:Syntax1.numeric + Listener.numeric:Syntax2.numeric, d)
>>> mod.matrix.new <- model.matrix(mod.formula.new, d)
>>> colnames(mod.matrix.new)
>>> 
>>> [1] "(Intercept)"                     
>>> [2] "Syntax1.numeric"
>>> [3] "Syntax2.numeric"                  
>>> [4] "Syntax1.numeric:Listener.numeric"
>>> [5] "Syntax2.numeric:Listener.numeric"
>>> 
>>> Now things are as they should be: no more mysterious extra dummy variable 
>>> containing information about the main effect of Listener! This last model 
>>> is the one you should compare your original model against to get the 
>>> significance of the main effect of Listener.
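As a sanity check that the numeric recoding really fixes the degenerate comparison, here is a self-contained sketch using lm() on fake data rather than Rachel's glmer models (all variable names and values are invented): with numeric predictors, dropping the main-effect term changes the model, and the comparison becomes a genuine 1-df test.

```r
set.seed(1)
d <- expand.grid(Listener = factor(c("L1", "L2")),
                 Syntax   = factor(c("S1", "S2", "S3")),
                 rep      = 1:20)
d$y <- rnorm(nrow(d))                               # fake outcome
d$Listener.num <- ifelse(d$Listener == "L2", 1, 0)  # numeric dummy variables
d$Syntax2.num  <- ifelse(d$Syntax == "S2", 1, 0)
d$Syntax3.num  <- ifelse(d$Syntax == "S3", 1, 0)

full    <- lm(y ~ Listener.num * (Syntax2.num + Syntax3.num), d)
reduced <- lm(y ~ Syntax2.num + Syntax3.num +
                  Listener.num:Syntax2.num + Listener.num:Syntax3.num, d)
# Because the predictors are numeric, R cannot silently reparameterize the
# reduced model, so this is a real 1-df test of the Listener main effect
anova(reduced, full)
```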
>>> 
>>> Hope this was helpful!
>>> 
>>> Best,
>>> Wednesday
>>> 
>>>> On Tue, Sep 20, 2016 at 12:26 PM Levy, Roger <rl...@ucsd.edu> wrote:
>>>> Hi Dan,
>>>> 
>>>> I’m having a bit of trouble figuring out exactly how your two comments 
>>>> comport with one another, but I think the crucial point here is that the 
>>>> procedure I outline in the paper is simply how to do exactly what is done 
>>>> in traditional ANOVA analyses.  In this approach, the expected effect size 
>>>> of, for example, ListenerType does not depend on the relative amounts of 
>>>> data in the various levels of SyntaxType (which is what I think you’re 
>>>> referring to by “balance of the levels”).
>>>> 
>>>> Your caveat regarding whether the main effect of a factor X1 necessarily 
>>>> has a sensible interpretation in the presence of the interaction between 
>>>> X1 and X2 is certainly appropriate.  In the beginning of the paper I have 
>>>> a few remarks on the caution that should be applied.  I do think that for 
>>>> factorial ANOVA analyses the main effect can often have a useful 
>>>> interpretation as the “across-the-board” effect that X1 has, regardless of 
>>>> the value of X2 (which once again is the traditional ANOVA interpretation 
>>>> of a main effect).
>>>> 
>>>> If my responses to your comments aren’t on target, I would be very glad 
>>>> for clarification!
>>>> 
>>>> Best
>>>> 
>>>> Roger
>>>> 
>>>> 
>>>>> On Sep 19, 2016, at 7:31 PM, Daniel Ezra Johnson 
>>>>> <danielezrajohn...@gmail.com> wrote:
>>>>> 
>>>>> If you follow this procedure, though, you'd be testing for the effect of 
>>>>> Listener when SyntaxType is "in the middle" (unweighted) of the three 
>>>>> levels. This quantity not only has no sensible interpretation, it also 
>>>>> depends on the balance of the levels of SyntaxType in the data.
>>>>> 
>>>>> If the effect of Listener is in the same direction for each level of 
>>>>> SyntaxType, such a test might be useful, but otherwise I don't think it 
>>>>> would be?
>>>>> 
>>>>> Dan 
>>>> 
>>>>> On Sep 19, 2016, at 8:00 PM, Daniel Ezra Johnson 
>>>>> <danielezrajohn...@gmail.com> wrote:
>>>>> 
>>>>>> it also depends on the balance of the levels of SyntaxType in the data.
>>>>> 
>>>>> Well not quite. The idea of testing the "middle level" is right, and 
>>>>> whatever this means doesn't change if the balance of data changes across 
>>>>> levels...
>>>>> 
>>>>> But you could have two data sets where the effects of listener for each 
>>>>> level of SyntaxType are the same (between the two data sets), but the 
>>>>> significance of this test changes…
>>>>>> On Mon, Sep 19, 2016 at 2:45 PM, Levy, Roger <rl...@ucsd.edu> wrote:
>>>>>> Hi Rachel,
>>>>>> 
>>>>>> If your goal is to test the main effect of Listener in the presence of 
>>>>>> the Listener-SyntaxType interaction, as would typically be done in 
>>>>>> traditional ANOVA analyses, I recommend you read this brief paper I 
>>>>>> wrote a few years ago on how to do this:
>>>>>> 
>>>>>>   http://arxiv.org/abs/1405.2094
>>>>>> 
>>>>>> It is exactly targeted at this problem, and explains why you’re getting 
>>>>>> the behavior you report due to differences in how R treats factors 
>>>>>> versus numeric variables in formulae.  (Setting the contrasts on the 
>>>>>> factor has no impact.)
>>>>>> 
>>>>>> I have no explanation for your reported behavior of why you don’t get 
>>>>>> this problem when you test for the main effect of SyntaxType; if you 
>>>>>> give further details, we might be able to help further!
>>>>>> 
>>>>>> Best
>>>>>> 
>>>>>> Roger
>>>>>> 
>>>>>>> On Sep 18, 2016, at 5:57 PM, Rachel Ostrand <rostr...@cogsci.ucsd.edu> 
>>>>>>> wrote:
>>>>>>> 
>>>>>>> Hi everyone,
>>>>>>> 
>>>>>>> I'm having trouble with some 2-factor glmer models that I'm trying to 
>>>>>>> run, such that the model with one of the main effects removed is coming 
>>>>>>> out identical to the full model. Some colleagues suggested that this 
>>>>>>> might be due to the coding of my factors, specifically because I have a 
>>>>>>> factor that has 3 levels, and that one needs to be treated differently, 
>>>>>>> but I'm not sure how - or why - to do that.
>>>>>>> 
>>>>>>> Brief summary of my data:
>>>>>>> -My DV (called Target_E2_pref) is a binary categorical variable.
>>>>>>> -There are two categorical IVs: Listener (2 levels) and SyntaxType (3 
>>>>>>> levels).
>>>>>>> -Listener varies by both subject and item (i.e., picture); SyntaxType 
>>>>>>> only varies by subject.
>>>>>>> 
>>>>>>> When I dummy coded my variables using contr.treatment(), the model with 
>>>>>>> the main effect of Listener removed from the fixed effects comes out 
>>>>>>> identical to the full model:
>>>>>>> 
>>>>>>> SoleTrain = read.table(paste(path, "SoleTrain.dat", sep=""), header=T)
>>>>>>> SoleTrain$Listener.f = factor(SoleTrain$Listener, labels=c("E1", "E2"))
>>>>>>> contrasts(SoleTrain$Listener.f) = contr.treatment(2)
>>>>>>> SoleTrain$SyntaxType.f = factor(SoleTrain$SyntaxType,
>>>>>>>     labels=c("Transitive", "Locative", "Dative"))
>>>>>>> contrasts(SoleTrain$SyntaxType.f) = contr.treatment(3)
>>>>>>> 
>>>>>>> # Create full model:
>>>>>>> SoleTrain.full <- glmer(Target_E2_pref ~ Listener.f*SyntaxType.f +
>>>>>>>     (1 + Listener.f*SyntaxType.f|Subject) + (1 + Listener.f|Picture),
>>>>>>>     data = SoleTrain, family = binomial, verbose=T,
>>>>>>>     control=glmerControl(optCtrl=list(maxfun=20000)))
>>>>>>> 
>>>>>>> # Create model with main effect of Listener removed:
>>>>>>> SoleTrain.noListener <- glmer(Target_E2_pref ~ SyntaxType.f +
>>>>>>>     Listener.f:SyntaxType.f + (1 + Listener.f*SyntaxType.f|Subject) +
>>>>>>>     (1 + Listener.f|Picture), data = SoleTrain, family = binomial,
>>>>>>>     verbose=T, control=glmerControl(optCtrl=list(maxfun=20000)))
>>>>>>> 
>>>>>>> > anova(SoleTrain.full, SoleTrain.noListener)
>>>>>>> Data: SoleTrain
>>>>>>> Models:
>>>>>>> SoleTrain.full: Target_E2_pref ~ Listener.f * SyntaxType.f + (1 + 
>>>>>>> Listener.f * SyntaxType.f | Subject) + (1 + Listener.f | Picture)
>>>>>>> SoleTrain.noListener: Target_E2_pref ~ SyntaxType.f + 
>>>>>>> Listener.f:SyntaxType.f + (1 + Listener.f * SyntaxType.f | Subject) + 
>>>>>>> (1 + Listener.f | Picture)
>>>>>>>                      Df    AIC    BIC  logLik deviance Chisq Chi Df Pr(>Chisq)
>>>>>>> SoleTrain.full       30 2732.5 2908.5 -1336.2   2672.5
>>>>>>> SoleTrain.noListener 30 2732.5 2908.5 -1336.2   2672.5     0      0          1
>>>>>>> 
>>>>>>> However, I don't have this problem when I test for the main effect of 
>>>>>>> SyntaxType, and remove the SyntaxType.f factor from the fixed effects. 
>>>>>>> (That is, this produces a different model than the full model.)
>>>>>>> 
>>>>>>> Someone suggested that Helmert coding was better for factors with more 
>>>>>>> than two levels, so I tried running the same models with Helmert coding 
>>>>>>> [contrasts(SoleTrain$SyntaxType.f) = contr.helmert(3)], but the models 
>>>>>>> come out identical to the way they do with dummy coding. So why does 
>>>>>>> the model with the main effect of Listener removed come out the same 
>>>>>>> as the model with it retained?
>>>>>>> 
>>>>>>> Any suggestions as to what I'm doing wrong?
>>>>>>> 
>>>>>>> Thanks!
>>>>>>> Rachel
>>>>>> 
>>>>> 
