I'm trying to build a mixed logit model using lmer, and I have some questions about poly() and the use of quadratic terms in general. My understanding is that, by default, poly() creates orthogonal polynomials, so the coefficients are not easily interpretable. On the other hand, m ~ poly(x, 2, raw=T) should be equivalent to m ~ x + xsq, where xsq is precomputed as x^2. I have verified the second statement by building one model with raw poly() and another with precomputed quadratic terms, and the coefficients are indeed virtually identical. Now I have a couple of questions:
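For concreteness, here is a minimal version of that check on simulated data (the variables here are toy ones, not from my actual dataset):

```r
## Toy check: raw poly() vs. a precomputed quadratic term
set.seed(1)
x   <- rnorm(100)
y   <- 1 + 2 * x - 0.5 * x^2 + rnorm(100)
xsq <- x^2

m_poly <- lm(y ~ poly(x, 2, raw = TRUE))  # raw polynomial terms
m_sq   <- lm(y ~ x + xsq)                 # precomputed quadratic

## Coefficients agree (up to term names)
all.equal(unname(coef(m_poly)), unname(coef(m_sq)))          # TRUE

## Orthogonal poly() gives different coefficients but the same fit
all.equal(unname(fitted(m_poly)),
          unname(fitted(lm(y ~ poly(x, 2)))))                # TRUE
```

The last line is why I understand the raw/orthogonal choice to matter only for interpretation of the coefficients, not for the fitted model itself.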
1. When I try to build a model using raw=F, I get the following error:

    m22 <- lmer(is_err ~ poly(pmean_sex, 2) + (1|speaker) + (1|ref),
                x=T, y=T, family="binomial")
    Error in x * raw^2 : non-conformable arrays

but

    m22 <- lmer(is_err ~ poly(pmean_sex, 2, raw=T) + (1|speaker) + (1|ref),
                x=T, y=T, family="binomial")

works fine. Normally my model has far more predictors, but this is the only one that seems to cause the problem. The problem doesn't seem to be specific to lmer, since it's reproducible with lrm and lm as well. Does anyone know what this error means? I can't find an answer in the R help archives that makes any sense.

More generally, should I care about trying to fix it? That is, is there a good reason to prefer orthogonal polynomials over raw ones, aside from reducing collinearity slightly? Since I would like to be able to interpret the coefficients (i.e., determine the relative importance of each variable as a predictor of error rates), I would tend towards raw polynomials anyway.

2. I always hear that if you have both a linear and a quadratic term for a particular predictor, you shouldn't drop the linear term (even if it shows up as not significant) while retaining the quadratic term. In fact, if you were using poly(), it wouldn't even be possible to do that. But suppose you instead precompute all the quadratic terms and treat them as separate variables. It seems to me that retaining the quadratic term for a variable x while deleting the linear term is essentially just a transformation of the variable, no different from taking the log of x as the predictor, which we do all the time for linguistic variables. So is there really any reason not to do it?

--AuH2O

-- 
The University of Edinburgh is a charitable body, registered in Scotland, with registration number SC005336.

_______________________________________________
R-lang mailing list
[email protected]
http://pidgin.ucsd.edu/mailman/listinfo/r-lang
