[R-lang] centering vs. applying contrast sum coding

mnc Thu, 12 Nov 2015 03:22:40 -0800

  

Hello everyone,


I posted a question on this topic about a month ago
but received no answers. I am now re-posting a modified version
of it, hoping that someone out there will help me clear up my doubts.


So far in my mixed regression models I have 'centered' any 2-level
predictors with the 'scale as numeric' command and interpreted the
output as giving ANOVA-style effects and interactions.
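
For concreteness, a minimal sketch of this kind of centering. It assumes
that 'scale as numeric' corresponds to scale(as.numeric(...)); the data
frame d and the factor cond2 are hypothetical names used only for
illustration.

```r
# Hypothetical toy data: 'cond2' stands in for a 2-level predictor
# with unequal cell counts (3 cases of "a", 5 of "b").
d <- data.frame(cond2 = factor(c("a", "a", "a", "b", "b", "b", "b", "b")))

# as.numeric() maps the factor levels to 1 and 2; scale() then subtracts
# the mean and divides by the standard deviation. scale() returns a
# one-column matrix, so it is converted back to a plain vector.
d$cond2_c <- as.numeric(scale(as.numeric(d$cond2)))

# The centered predictor has mean zero and unit standard deviation,
# even though the two levels are not equally frequent.
mean(d$cond2_c)
sd(d$cond2_c)
```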

Recently I have
had to run a model that has one 2-level predictor and one 3-level
predictor. I know that using 'scale as numeric' on the 3-level predictor
is not the right thing to do, so I have only centered the 2-level
predictor and left the 3-level predictor as it is (I assume this means
this predictor receives the default treatment coding in the model). But
with this kind of setup I have had serious convergence problems: in
practice the model converges only with intercept-only random effects
((1|subj) and (1|item)). When it does not converge, I get messages such
as "Model is nearly unidentifiable: large eigenvalue ratio - Rescale
variables?", which I do not understand (how exactly do I rescale the
variables?). Furthermore, this model is not really what I want, as it
does not give me the main effect of the 2-level predictor (averaged
across the three levels of the 3-level predictor).
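
For illustration, the default coding of a 3-level factor can be inspected
directly. This is a minimal sketch with a made-up factor (cond3 and its
level names are hypothetical), assuming R's default contrast options are
in effect. Under treatment coding the first level is the reference, so in
a model that includes the interaction, the coefficient for the other
predictor is its effect at that reference level rather than an average
over the three levels.

```r
# Hypothetical 3-level predictor; the level names are made up.
cond3 <- factor(c("low", "mid", "high"), levels = c("low", "mid", "high"))

# With no contrasts set explicitly, R applies treatment (dummy) coding to
# unordered factors: the first level is the reference and is coded 0 in
# every column; each other level gets its own 0/1 indicator column.
tc <- contrasts(cond3)
tc
```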

After reading up a bit, I have
investigated alternatives, and I have found that when I apply sum
contrast coding to the two predictors (contr.sum(2) to the 2-level
predictor and contr.sum(3) to the 3-level predictor), the full model
converges with no problems; in addition, I get what I want: the main
effect of the 2-level predictor.
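
A minimal sketch of assigning these contrasts, using base R only. The
data frame d and the factors cond2 and cond3 are hypothetical, and the
toy design is perfectly balanced (one row per cell), which is an
assumption made for illustration and not a property of the real data.

```r
# Hypothetical fully crossed 2 x 3 design, one row per cell.
d <- expand.grid(
  cond2 = factor(c("a", "b")),
  cond3 = factor(c("low", "mid", "high"), levels = c("low", "mid", "high"))
)

# Assign sum-to-zero contrasts to both factors.
contrasts(d$cond2) <- contr.sum(2)  # a = +1, b = -1
contrasts(d$cond3) <- contr.sum(3)  # last level is coded -1 in both columns

# Design matrix for main effects plus interaction.
X <- model.matrix(~ cond2 * cond3, data = d)

# In this balanced toy design every non-intercept column sums to zero.
colSums(X)[-1]
```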

But I would like to know the experts' opinion about this
because I have read that when the data is not fully balanced, i.e. when
we do not have the same number of cases per condition (for example when
there is missing data), we should center our predictors (and so far I
have addressed the imbalance by centering/scaling all predictors with 2
levels with "scale as numeric" and using the centered variables as
predictors). 
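
The difference between the two approaches under imbalance can be seen on
toy data. This is a sketch only: the factor cond2 and its cell counts
are made up, and centering is done here by subtracting the mean without
dividing by the standard deviation.

```r
# Hypothetical unbalanced 2-level predictor: 3 cases of "a", 5 of "b".
cond2 <- factor(rep(c("a", "b"), times = c(3, 5)))

# Sum coding assigns a = +1 and b = -1 regardless of the cell counts.
contrasts(cond2) <- contr.sum(2)
sum_coded <- model.matrix(~ cond2)[, 2]

# Centering subtracts the observed mean of the numeric codes (1 and 2),
# so the resulting values depend on how many cases each level has.
centered <- as.numeric(cond2) - mean(as.numeric(cond2))

mean(sum_coded)  # -0.25: the sum-coded column is not mean-zero here
mean(centered)   # 0: the centered column is mean-zero by construction
```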

So my question is: Does sum contrast coding address any
imbalance in my data (which is not perfectly balanced because of missing
data in some conditions)? If it does not, do you have any suggestions as
to how I could solve this problem, bearing in mind that I want a model
that gives me: the main effect of the 2-level predictor (in the ANOVA
sense), whether each of the levels of the 3-level predictor differs from
the grand mean (or at least whether two of the levels do, given that one
is only the baseline) and any interaction between the two predictors (in
the ANOVA sense)?  

Thanks so much for your help.

Maria Nella Carminati
