[R] lme model specification problem (Error in MEEM...)
Dear all, In lme, models in which a factor is fully contained in another lead to an error. This is not the case when using lm/aov. I understand that these factors are aliased, but believe that such models make sense when the factors are fitted sequentially. For example, I sometimes fit a factor first as linear term (continuous variable with discrete levels, e.g. 1,2,4,6), and then as factor, with the latter then accounting for the deviation from linearity. If x contains the values 1,2,4,6, the model formula then would be y ~ x + as.factor(x) A complete example is here: library(nlme) d - data.frame(x=rep(c(1,2,4,6),each=2),subj=factor(rep(1:16,each=2))) d$y - rnorm(32) + rnorm(length(levels(d$subj)))[d$subj] anova(lme(y~x,random=~1|subj,data=d)) ## works anova(lme(y~as.factor(x),random=~1|subj,data=d))## works anova(lme(y~x+as.factor(x),random=~1|subj,data=d)) ## fails: # Error in MEEM(object, conLin, control$niterEM) : # Singularity in backsolve at level 0, block 1 summary(aov(y~x+as.factor(x)+Error(subj),data=d)) ## works: # Error: subj # Df Sum Sq Mean Sq F value Pr(F) # x 1 8.434 8.434 4.780 0.0493 * # as.factor(x) 2 10.459 5.230 2.964 0.0900 . # Residuals12 21.176 1.765 # ... rest of output removed ... I understand I can to some extent work around this limitation by modifying the contrast encoding, but I then still don't get an overall test for as.factor(x) (in the example above, a test for deviation from linearity). d$xfac - factor(d$x) contrasts(d$xfac)-c(1,2,4,6) summary(lme(y~xfac,random=~1|subj,data=d)) Is there a way to work around this limitation of lme? Or do I mis-specify the model? Thanks for your advice. Pascal __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] lme model specification problem (Error in MEEM...)
Pascal A. Niklaus pascal.niklaus at ieu.uzh.ch writes: In lme, models in which a factor is fully contained in another lead to an error. This is not the case when using lm/aov. I understand that these factors are aliased, but believe that such models make sense when the factors are fitted sequentially. For example, I sometimes fit a factor first as linear term (continuous variable with discrete levels, e.g. 1,2,4,6), and then as factor, with the latter then accounting for the deviation from linearity. If x contains the values 1,2,4,6, the model formula then would be y ~ x + as.factor(x) A complete example is here: library(nlme) d - data.frame(x=rep(c(1,2,4,6),each=2),subj=factor(rep(1:16,each=2))) d$y - rnorm(32) + rnorm(length(levels(d$subj)))[d$subj] anova(lme(y~x,random=~1|subj,data=d)) ## works anova(lme(y~as.factor(x),random=~1|subj,data=d))## works anova(lme(y~x+as.factor(x),random=~1|subj,data=d)) ## fails: # Error in MEEM(object, conLin, control$niterEM) : # Singularity in backsolve at level 0, block 1 summary(aov(y~x+as.factor(x)+Error(subj),data=d)) ## works: # Error: subj # Df Sum Sq Mean Sq F value Pr(F) # x 1 8.434 8.434 4.780 0.0493 * # as.factor(x) 2 10.459 5.230 2.964 0.0900 . # Residuals12 21.176 1.765 # ... rest of output removed ... I understand I can to some extent work around this limitation by modifying the contrast encoding, but I then still don't get an overall test for as.factor(x) (in the example above, a test for deviation from linearity). d$xfac - factor(d$x) contrasts(d$xfac)-c(1,2,4,6) summary(lme(y~xfac,random=~1|subj,data=d)) Is there a way to work around this limitation of lme? Or do I mis-specify the model? Thanks for your advice. This question would probably be more appropriate for the r-sig-mixed-mod...@r-project.org mailing list. A short answer (if you re-post to r-sig-mixed-models I might answer at greater length) would be that you should be able to quantify the difference between y~x and y~factor(x) by an anova comparison of the two *models*, i.e. anova(m1,m2) Another thing to consider would be setting using orthogonal polynomial contrasts for factor(x), where summary() would give you a result for the constant, linear, quadratic ... terms Ben Bolker __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
[R] lme model specification
Dear all this is a question of model specification in lme which I'd for which I'd greatly appreciate some guidance. Suppose I have data in long format gene treatment rep Y 11 1 4.32 11 2 4.67 11 3 5.09 .. .. .. .. .. .. 14 1 3.67 14 2 4.64 14 3 4.87 .. .. .. .. .. .. 2000 1 1 5.12 2000 1 2 2.87 2000 1 3 7.23 .. .. .. .. .. .. 2000 4 1 2.48 2000 4 2 3.93 2000 4 3 5.17 that is, I have data Y_{gtr} for g (gene) =1,...,2000t (treatment) = 1,...,4 andr (replicate) = 1,...,3 I would like to fit the following linear mixed model using lme Y_{gtr} = \mu_{g} + W_{gt} + Z_{gtr} where the \mu_{g}'s are fixed gene effects, W_{gt} ~ N(0, \sigma^{2}) gene-treatment interactions, and residual errors Z_{gtr} ~ N(0,\tau^{2}). (Yes, I know I'm specifying an interaction between gene and treatment without specifying a treatment main effect ! - there is good reason for this) I know that specifying model.1 - lme(Y ~ -1 + factor(gene), data=data, random= ~1|gene/treatment) fits Y_{gtr} = \mu_{g} + U_{g} + W_{gt} + Z_{gtr} with \mu_{g}, W_{gt} and Z_{gtr} as previous and U_{g} ~ N(0,\gamma^{2}), but I do NOT want to specify a random gene effect. I have scoured Bates and Pinheiro without coming across a parallel example. Any help would be greatly appreciated Best Gerwyn Green School of Health and Medicine Lancaster Uinversity __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.
Re: [R] lme model specification
On Sun, Nov 15, 2009 at 7:19 AM, Green, Gerwyn (greeng6) g.gre...@lancaster.ac.uk wrote: Dear all this is a question of model specification in lme which I'd for which I'd greatly appreciate some guidance. Suppose I have data in long format gene treatment rep Y 1 1 1 4.32 1 1 2 4.67 1 1 3 5.09 . . . . . . . . . . . . 1 4 1 3.67 1 4 2 4.64 1 4 3 4.87 . . . . . . . . . . . . 2000 1 1 5.12 2000 1 2 2.87 2000 1 3 7.23 . . . . . . . . . . . . 2000 4 1 2.48 2000 4 2 3.93 2000 4 3 5.17 that is, I have data Y_{gtr} for g (gene) =1,...,2000 t (treatment) = 1,...,4 and r (replicate) = 1,...,3 I would like to fit the following linear mixed model using lme Y_{gtr} = \mu_{g} + W_{gt} + Z_{gtr} where the \mu_{g}'s are fixed gene effects, W_{gt} ~ N(0, \sigma^{2}) gene-treatment interactions, and residual errors Z_{gtr} ~ N(0,\tau^{2}). (Yes, I know I'm specifying an interaction between gene and treatment without specifying a treatment main effect ! - there is good reason for this) You are going to end up estimating 2000 fixed-effects parameters for gene, which will take up a lot of memory (one copy of the model matrix for the fixed-effects will be 24000 by 2000 double precision numbers or about 400 MB). You might be able to fit that in lme as lme(Y ~ -1 + factor(gene), data = data, random = ~ 1|gene:treatment) but it will probably take a long time or run out of memory. There is an alternative which is to use the development branch of the lme4 package that allows for a sparse model matrix for the fixed-effects parameters. Or ask yourself if you really need to model the genes as fixed effects instead of random effects. We have seen situations where users do not want the shrinkage involved with random effects but it is rare. If you want to follow up on the development branch (for which binary packages are not currently available, i.e. you need to compile it yourself) then we can correspond off-list. I know that specifying model.1 - lme(Y ~ -1 + factor(gene), data=data, random= ~1|gene/treatment) fits Y_{gtr} = \mu_{g} + U_{g} + W_{gt} + Z_{gtr} with \mu_{g}, W_{gt} and Z_{gtr} as previous and U_{g} ~ N(0,\gamma^{2}), but I do NOT want to specify a random gene effect. I have scoured Bates and Pinheiro without coming across a parallel example. Any help would be greatly appreciated Best Gerwyn Green School of Health and Medicine Lancaster Uinversity __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code. __ R-help@r-project.org mailing list https://stat.ethz.ch/mailman/listinfo/r-help PLEASE do read the posting guide http://www.R-project.org/posting-guide.html and provide commented, minimal, self-contained, reproducible code.