[R] anova leads to an error
Dear R-list,

the following code had been running well over the last months:

    exam <- matrix(rnorm(100, 0, 1), 10, 10)
    gg <- factor(c(rep("A", 5), rep("B", 5)))
    mlmfit <- lm(exam ~ 1); mlmfitG <- lm(exam ~ gg)
    result <- anova(mlmfitG, mlmfit, X = ~0, M = ~1)

Until, all of a sudden, the following error occurred:

    Error in apply(abs(sapply(deltassd, function(X) diag((T %*% X %*% t(T), :
      dim(X) must have a positive length

I have not kept track of the changes in my R version, so it might have to do with that. Now it is: R version 2.9.0 (2009-04-17). Does anybody know more about this error? It would help me a lot!

Thank you very much!
Nils

__
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] Anova and unbalanced designs
Dear John,

thank you for your answer. You are right, I also would not have expected a divergent result. I have double-checked it again. No, I got type-III tests. When I use type II, I get the same results in SPSS as in 'Anova' (using also type-II tests). My guess was that the somehow weighted means SPSS shows could be responsible for this difference. Or that using 'Anova' would not be correct for unequal group n's, which was not the case I think. Do you have any further ideas?

Thank you!
Nils

John Fox wrote:
> Dear Nils,
>
> This is a pretty simple design, and I wouldn't have thought that there was much room for getting different results. More generally, but not here (since there's only one between-subject factor), one shouldn't use contr.treatment() with type-III tests, as you did. Is it possible that you got type-II tests from SPSS:
>
> -- snip --
>
> > summary(Anova(betweenanova, idata=with, idesign= ~within, type = "II"))
>
> Type II Repeated Measures MANOVA Tests:
>
> --
>
> Term: between
>
>  Response transformation matrix:
>    (Intercept)
> w1           1
> w2           1
>
> Sum of squares and products for the hypothesis:
>             (Intercept)
> (Intercept)         9.6
>
> Sum of squares and products for error:
>             (Intercept)
> (Intercept)          18
>
> Multivariate Tests: between
>                  Df test stat approx F num Df den Df   Pr(>F)
> Pillai            1  0.347826     4.27      1      8 0.072726 .
> Wilks             1  0.652174     4.27      1      8 0.072726 .
> Hotelling-Lawley  1  0.53         4.27      1      8 0.072726 .
> Roy               1  0.53         4.27      1      8 0.072726 .
> ---
> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
>
> --
>
> Term: within
>
>  Response transformation matrix:
>    within1
> w1       1
> w2      -1
>
> Sum of squares and products for the hypothesis:
>         within1
> within1     0.4
>
> Sum of squares and products for error:
>         within1
> within1    21.3
>
> Multivariate Tests: within
>                  Df test stat approx F num Df den Df  Pr(>F)
> Pillai            1 0.0184049    0.150      1      8 0.70864
> Wilks             1 0.9815951    0.150      1      8 0.70864
> Hotelling-Lawley  1 0.0187500    0.150      1      8 0.70864
> Roy               1 0.0187500    0.150      1      8 0.70864
>
> --
>
> Term: between:within
>
>  Response transformation matrix:
>    within1
> w1       1
> w2      -1
>
> Sum of squares and products for the hypothesis:
>         within1
> within1    4.27
>
> Sum of squares and products for error:
>         within1
> within1    21.3
>
> Multivariate Tests: between:within
>                  Df test stat approx F num Df den Df  Pr(>F)
> Pillai            1     0.167    1.600      1      8 0.24150
> Wilks             1     0.833    1.600      1      8 0.24150
> Hotelling-Lawley  1     0.200    1.600      1      8 0.24150
> Roy               1     0.200    1.600      1      8 0.24150
>
> Univariate Type II Repeated-Measures ANOVA Assuming Sphericity
>
>                    SS num Df Error SS den Df      F  Pr(>F)
> between        4.8000      1   9.0000      8 4.2667 0.07273 .
> within         0.2000      1  10.6667      8 0.1500 0.70864
> between:within 2.1333      1  10.6667      8 1.6000 0.24150
> ---
> Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
>
> -- snip --
>
> I hope this helps,
> John
>
> --
> John Fox, Professor
> Department of Sociology
> McMaster University
> Hamilton, Ontario, Canada
> web: socserv.mcmaster.ca/jfox
>
> > -----Original Message-----
> > From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Skotara
> > Sent: January-23-09 12:16 PM
> > To: r-help@r-project.org
> > Subject: [R] Anova and unbalanced designs
> >
> > Dear R-list! My question is related to an Anova including within and between subject factors and unequal group sizes. [...]
Re: [R] Anova and unbalanced designs
, these agree with Anova():

--- snip

Type III Repeated Measures MANOVA Tests: Pillai test statistic
               Df test stat approx F num Df den Df    Pr(>F)
(Intercept)     1     0.963  209.067      1      8 5.121e-07 ***
between         1     0.348    4.267      1      8   0.07273 .
within          1     0.048    0.400      1      8   0.54474
between:within  1     0.167    1.600      1      8   0.24150
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

Univariate Type III Repeated-Measures ANOVA Assuming Sphericity

                    SS num Df Error SS den Df        F    Pr(>F)
(Intercept)    235.200      1    9.000      8 209.0667 5.121e-07 ***
between          4.800      1    9.000      8   4.2667   0.07273 .
within           0.533      1   10.667      8   0.4000   0.54474
between:within   2.133      1   10.667      8   1.6000   0.24150
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

--- snip

So, unless Anova() and SAS are making the same error, I guess SPSS is doing something strange (or perhaps you didn't do what you intended in SPSS). As I said before, this problem is so simple that I find it hard to understand where there's room for error, but I wanted to check against SAS to test my sanity (a procedure that will likely get a rise out of some list members).

Maybe you should send a message to the SPSS help list.

Regards,
John

--
John Fox, Professor
Department of Sociology
McMaster University
Hamilton, Ontario, Canada
web: socserv.mcmaster.ca/jfox

> -----Original Message-----
> From: r-help-boun...@r-project.org [mailto:r-help-boun...@r-project.org] On Behalf Of Skotara
> Sent: January-24-09 6:30 AM
> To: John Fox
> Cc: r-help@r-project.org
> Subject: Re: [R] Anova and unbalanced designs
>
> Dear John, thank you for your answer. You are right, I also would not have expected a divergent result. [...]
[R] Anova and unbalanced designs
Dear R-list!

My question is related to an Anova including within- and between-subject factors and unequal group sizes. Here is a minimal example of what I did:

    library(car)
    within1 <- c(1,2,3,4,5,6,4,5,3,2); within2 <- c(3,4,3,4,3,4,3,4,5,4)
    values <- data.frame(w1 = within1, w2 = within2)
    values <- as.matrix(values)
    between <- factor(c(rep(1,4), rep(2,6)))
    betweenanova <- lm(values ~ between)
    with <- expand.grid(within = factor(1:2))
    withinanova <- Anova(betweenanova, idata=with, idesign= ~as.factor(within), type = "III")

I do not know if this is the appropriate method to deal with unbalanced designs.

I observed that SPSS calculates everything identically except the main effect of the within factor; here, the SSQ and F-value are very different. If selecting the option "show means", the means for the levels of the within factor in SPSS are the same as:

    mean(c(mean(values[1:4, "w1"]), mean(values[5:10, "w1"])))

and

    mean(c(mean(values[1:4, "w2"]), mean(values[5:10, "w2"])))

In other words, they are calculated as if both groups had the same size. I wonder if this is a good solution and if so, how could I do the same thing in R? However, if SPSS treats the group sizes as identical for this effect, then why not for the interaction, which yields the same result as Anova()?

Many thanks in advance for your time and help!
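The difference between the two kinds of marginal means described above can be checked by hand. The following is only a sketch using the data from the post: it computes the pooled ("weighted") and the group-averaged ("unweighted") means, then forms an F statistic for the within effect from the difference scores under each weighting. The variable names (`weighted_means`, `F_weighted`, and so on) are illustrative and not part of car or SPSS. Under these assumptions the two F values come out as 0.15 and 0.4, which are the within-factor F values shown in the type-II and type-III outputs elsewhere in this thread.

```r
# Data from the post
w1 <- c(1, 2, 3, 4, 5, 6, 4, 5, 3, 2)
w2 <- c(3, 4, 3, 4, 3, 4, 3, 4, 5, 4)
between <- factor(c(rep(1, 4), rep(2, 6)))

# Weighted marginal means: every subject counts equally
weighted_means <- c(w1 = mean(w1), w2 = mean(w2))

# Unweighted marginal means: every group counts equally
unweighted_means <- c(w1 = mean(tapply(w1, between, mean)),
                      w2 = mean(tapply(w2, between, mean)))

# Within effect expressed through the difference scores
d  <- w1 - w2
n  <- tapply(d, between, length)          # group sizes (4 and 6)
m  <- tapply(d, between, mean)            # group means of the differences
# Pooled within-group variance of the differences (10 - 2 = 8 df)
s2 <- sum(tapply(d, between, function(x) sum((x - mean(x))^2))) / (length(d) - 2)

# F for "weighted mean difference is zero"
F_weighted   <- mean(d)^2 / (s2 / sum(n))
# F for "unweighted mean difference is zero" (average of the two group means)
F_unweighted <- mean(m)^2 / (s2 * sum(1 / n) / 4)

weighted_means
unweighted_means
c(F_weighted = F_weighted, F_unweighted = F_unweighted)
```

With equal group sizes the two weightings coincide, which is why the question only arises for unbalanced designs.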
[R] assign a list using expression?
Dear R-users,

I would like to assign elements to a list in the following manner:

    mylist <- list(a = a, b = b, c = c)

To do this I tried

    myexpr <- expression(a = a, b = b, c = c)
    mylist <- list(eval(myexpr))

It ends up overwriting a when b is assigned and b when c is assigned. Additionally, the element of the list does not have a name. Could you tell me why this is the case?

Thank you very much in advance!

Best regards,
Nils
Re: [R] assign a list using expression?
Thank you Patrick and Gabor!

Sorry, I think I have not explained it well. The purpose is as follows:

    names <- letters[1:3]
    values <- data.frame(a = 1:3, b = 4:6, c = 7:9)

With more complicated objects similar to 'names' and 'values', I wrote the following line to assign the elements of the list:

    mycommand <- parse(text = paste(names, " = values[\"", names, "\"]", sep = ""))

However,

    list(eval(mycommand))

does not do what I want, whereas

    list(a = values["a"], b = values["b"], c = values["c"])

does. I cannot tell why... I am trying to understand what expression and eval do. I know that many times there are other ways to achieve the same goal. So here, too. But I think there should be a reason why it does not work that way.

Best regards!
Nils
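A possible explanation, sketched under the assumption that the setup is as in the post: `parse()` returns an expression vector with one element per statement, and `eval()` on such a vector evaluates the elements one after another and returns only the value of the last one. Each parsed statement `a = values["a"]` is a top-level assignment, not a named argument to `list()`. The names `nms`, `vals`, `cmds` and `wanted` below are illustrative stand-ins (chosen to avoid masking `base::names`), and `setNames(lapply(...))` is just one of several ways to get the intended result.

```r
nms  <- letters[1:3]
vals <- data.frame(a = 1:3, b = 4:6, c = 7:9)

# Three separate statements, each of them an assignment call
cmds <- parse(text = paste0(nms, " = vals[\"", nms, "\"]"))
length(cmds)

# eval() runs all three (creating a, b, c in the scratch environment)
# and hands back only the value of the last statement
res <- list(eval(cmds, envir = new.env()))
length(res)   # a single, unnamed element
names(res)

# One way to build the named list without parse()/eval()
wanted <- setNames(lapply(nms, function(n) vals[n]), nms)
str(wanted)
```

The same reasoning applies to `expression(a = a, b = b, c = c)`: it is a vector of three expressions, so `list(eval(myexpr))` wraps only the last value.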
Re: [R] How to get Greenhouse-Geisser epsilons from anova?
Dear John,

thank you for the kind offer! Sorry, I just made a mistake somewhere that I cannot trace back; now it works as you described it. Thank you again!

Dear Peter,

thank you for the information, I did not know about the quotation marks. It indeed works using "G-G Pr"! The SPSS and R output for the epsilons differ exactly by N/(N-(k-1)). So I think it must be the mentioned bug.

I wish you a merry Christmas!
Re: [R] How to get Greenhouse-Geisser epsilons from anova?
Dear John and Peter,

thank you both very much for your help! Everything works fine now!

John, Anova also works very well. Thank you very much! However, if I had more than 2 levels for the between factor, the same thing as mentioned occurred. The degrees of freedom showed that Anova calculated it as if all subjects came from the same group; for example, for main effect A the dfs are 1 and 35. Since I can get those values using anova, that causes no problem.

I saw that x$G, to get the Greenhouse-Geisser epsilon, does work for:

    x <- anova(mlmfitD, X = ~C+B, M = ~A+C+B, test = "Spherical")

but y$G does not work for:

    y <- anova(mlmfit, mlmfit0, X = ~C+B, M = ~A+C+B, idata = dd, test = "Spherical")

Finally, the Greenhouse-Geisser epsilons are identical using both methods and identical to the SPSS output. The Huynh-Feldt epsilons are not the same as those of SPSS. I will use GG instead.
Re: [R] How to get Greenhouse-Geisser epsilons from anova?
Thank you for your help! Sorry for bothering you again...

I still have trouble combining within- and between-subject factors. Interactions of within factors and D having only 2 levels work well. How can I get the main effect of D? I have tried anova(mlmfitD, mlmfit). With D having 3 levels I would expect the dfs to be 2 and 33. However, the output states 84,24?? As long as the between factor has only 2 levels, the between/within interactions fit well with SPSS, but if D has 3 levels, the mismatch is immense.

If I calculate the within effects with myma having not 12 subjects from one group but, for example, 24 from 2 groups, the output treats it as if all subjects came from the same group; for example, for main effect A the dfs are 1 and 35. SPSS puts out 1 and 33, which is what I would have expected...

Peter Dalgaard wrote:
> Nils Skotara wrote:
> > Thank you, this helped me a lot! All within effects and interactions work well!
> >
> > Sorry, but I still could not get how to include the between factor.. If I include D with 2 levels, then myma is 24 by 28. (another 12 by 28 for the second group of subjects.) mlmfitD <- lm(myma~D) is no problem, but whatever I tried afterwards did not seem logical to me. I am afraid I do not understand how to include the between factor. I cannot include ~D into M or X because it has length 24 whereas the other factors have 28...
>
> Just do the same as before, but comparing mlmfitD to mlmfit:
>
>     anova(mlmfitD, mlmfit, X=~A+B, M=~A+B+C)
>     # or anova(mlmfitD, mlmfit, X=~1, M=~C), as long as things are balanced
>
> gives the D:C interaction test (by testing whether the C contrasts depend on D).
>
> The four-factor interaction is
>
>     anova(mlmfitD, mlmfit, X=~(A+B+C)^2, M=~A*B*C)
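On the question of the between-subject main effect and its expected degrees of freedom: when every subject has all 28 within-subject cells, the between main effect can equivalently be tested by a one-way ANOVA on the per-subject means. The sketch below uses simulated data as a stand-in for `myma` (36 subjects in 3 groups of 12, a hypothetical layout matching the "2 and 33" expectation in the post); it is not the anova.mlm route discussed in the thread, only a cross-check for the degrees of freedom.

```r
set.seed(1)

# Hypothetical stand-in for the real data: 36 subjects x 28 within cells
n_per_group <- 12
D    <- factor(rep(c("g1", "g2", "g3"), each = n_per_group))
myma <- matrix(rnorm(36 * 28), nrow = 36, ncol = 28)

# Average over all within-subject cells, one value per subject
subject_means <- rowMeans(myma)

# One-way ANOVA on the subject means tests the main effect of D
tab <- anova(lm(subject_means ~ D))
tab
```

The `Df` column shows 2 and 33, i.e. (groups - 1) and (subjects - groups).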
Re: [R] How to get Greenhouse-Geisser epsilons from anova?
Thank you, this helped me a lot! All within effects and interactions work well!

Sorry, but I still could not get how to include the between factor.. If I include D with 2 levels, then myma is 24 by 28 (another 12 by 28 for the second group of subjects).

    mlmfitD <- lm(myma~D)

is no problem, but whatever I tried afterwards did not seem logical to me. I am afraid I do not understand how to include the between factor. I cannot include ~D into M or X because it has length 24 whereas the other factors have 28...

Quoting Peter Dalgaard [EMAIL PROTECTED]:
> Skotara wrote:
> > Dear Mr. Dalgaard,
> >
> > thank you very much for your reply, it helped me to progress a bit.
> >
> > The following works fine:
> >
> >     dd <- expand.grid(C = 1:7, B = c("r", "l"), A = c("c", "f"))
> >     myma <- as.matrix(myma)  # myma is a 12 by 28 list
> >     mlmfit <- lm(myma~1)
> >     mlmfit0 <- update(mlmfit, ~0)
> >     anova(mlmfit, mlmfit0, X= ~C+B, M = ~A+C+B, idata = dd, test="Spherical")
> >
> > which tests the main effect of A.
> >
> >     anova(mlmfit, mlmfit0, X= ~A+C, M = ~A+C+B, idata = dd, test="Spherical")
> >
> > which tests the main effect of B.
> >
> > However, I can not figure out how this works for the other effects. If I try:
> >
> >     anova(mlmfit, mlmfit0, X= ~A+B, M = ~A+C+B, idata = dd, test="Spherical")
> >
> > I get:
> >
> >     Error in function (object, ..., test = c("Pillai", "Wilks", "Hotelling-Lawley", :
> >       residuals have rank 1 < 4
>
> dd$C is not a factor with that construction. It works for me after
>
>     dd$C <- factor(dd$C)
>
> (The other message is nasty, though. It's slightly different in R-patched:
>
>     > anova(mlmfit, mlmfit0, X= ~A+B, M = ~A+C+B, idata = dd, test="Spherical")
>     Error in solve.default(Psi, B) :
>       system is computationally singular: reciprocal condition number = 2.17955e-34
>
> but it shouldn't happen... Looks like it is a failure of the internal Thin.row function. Ick!)
>
> > I also don't know how I can calculate the various interactions.. My read is I should change the second argument mlmfit0, too, but I can't figure out how...
>
> The within interactions should be straightforward, e.g.
>
>     M=~A*B*C
>     X=~A*B*C-A:B:C
>
> etc.
>
> The within/between interactions are obtained from the similar tests of the between factor(s), e.g.
>
>     mlmfitD <- lm(myma~D)
>
> and then
>
>     anova(mlmfitD, mlmfit,)
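The fix mentioned above (making `C` a factor in the within-subject design) can be tried on simulated data. This is a sketch only: `myma` here is random noise standing in for the real 12 by 28 matrix, and its columns are assumed to be ordered like the rows of `dd` (C varying fastest, then B, then A). The calls themselves follow the ones quoted in the thread.

```r
set.seed(42)

# Simulated stand-in for the 12 subjects x 28 cells data matrix
myma <- matrix(rnorm(12 * 28), nrow = 12, ncol = 28)

# Within-subject design; C is declared as a factor from the start
dd <- expand.grid(C = factor(1:7), B = c("r", "l"), A = c("c", "f"))

mlmfit  <- lm(myma ~ 1)
mlmfit0 <- update(mlmfit, ~ 0)

# Main effect of A: M spans A + C + B, X removes C and B
resA <- anova(mlmfit, mlmfit0, X = ~ C + B, M = ~ A + C + B,
              idata = dd, test = "Spherical")

# Main effect of C: the call that failed while C was numeric
resC <- anova(mlmfit, mlmfit0, X = ~ A + B, M = ~ A + C + B,
              idata = dd, test = "Spherical")
resC

# Column names containing "-" need quoting when extracted
resC[["G-G Pr"]]
```

With `C = 1:7` left numeric, `~ A + C + B` contributes a single linear column for C instead of six contrast columns, so the M and X spaces no longer describe the intended effect.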
Re: [R] How to get Greenhouse-Geisser epsilons from anova?
Dear Mr. Dalgaard,

thank you very much for your reply, it helped me to progress a bit.

The following works fine:

    dd <- expand.grid(C = 1:7, B = c("r", "l"), A = c("c", "f"))
    myma <- as.matrix(myma)  # myma is a 12 by 28 list
    mlmfit <- lm(myma~1)
    mlmfit0 <- update(mlmfit, ~0)
    anova(mlmfit, mlmfit0, X= ~C+B, M = ~A+C+B, idata = dd, test="Spherical")

which tests the main effect of A.

    anova(mlmfit, mlmfit0, X= ~A+C, M = ~A+C+B, idata = dd, test="Spherical")

which tests the main effect of B.

However, I cannot figure out how this works for the other effects. If I try:

    anova(mlmfit, mlmfit0, X= ~A+B, M = ~A+C+B, idata = dd, test="Spherical")

I get:

    Error in function (object, ..., test = c("Pillai", "Wilks", "Hotelling-Lawley", :
      residuals have rank 1 < 4

I also don't know how I can calculate the various interactions.. My read is I should change the second argument mlmfit0, too, but I can't figure out how...

Do you know what to do?
Thank you very much!

Peter Dalgaard wrote:
> Skotara wrote:
> > Dear all,
> >
> > I apologize for my basic question. I try to calculate an anova for repeated measurements with 3 factors (A,B,C) having 2, 2, and 7 levels, or with an additional fourth between-subjects factor D. Everything works fine using
> >
> >     aov(val ~ A*B*C + Error(subject/(A*B*C)))
> >
> > or
> >
> >     aov(val ~ (D*A*B*C) + Error(subject/(A*B*C)) + D)
> >
> > val, A, B, C, D and subject are columns in a data.frame.
> >
> > How can I get the estimated Greenhouse-Geisser and Huynh-Feldt epsilons? I know Peter Dalgaard described it in R-News Vol. 7/2, October 2007. However, unfortunately I am not able to apply that using my data...
>
> Why? It is supposed to work. You just need to work out the X and M specification for the relevant error strata and set test="Spherical" for anova.mlm, or work out the T contrast matrix explicitly if that suits your temper better.
>
> > Furthermore, I am still confused of how SPSS calculates the epsilons since it is mentioned that perhaps there are any errors in SPSS??
> >
> > I would be glad if anyone could help me! I am looking forward to hearing from you!
> >
> > Thank you!
> > Nils
[R] How to get Greenhouse-Geisser epsilons from anova?
Dear all,

I apologize for my basic question. I am trying to calculate an anova for repeated measurements with 3 factors (A, B, C) having 2, 2, and 7 levels, or with an additional fourth between-subjects factor D. Everything works fine using

    aov(val ~ A*B*C + Error(subject/(A*B*C)))

or

    aov(val ~ (D*A*B*C) + Error(subject/(A*B*C)) + D)

val, A, B, C, D and subject are columns in a data.frame.

How can I get the estimated Greenhouse-Geisser and Huynh-Feldt epsilons? I know Peter Dalgaard described it in R-News Vol. 7/2, October 2007. However, unfortunately I am not able to apply that to my data...

Furthermore, I am still confused about how SPSS calculates the epsilons, since it is mentioned that there are perhaps errors in SPSS??

I would be glad if anyone could help me! I am looking forward to hearing from you!

Thank you!
Nils
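For orientation, the Greenhouse-Geisser epsilon for a single within-subject factor with k levels can be computed directly from the k by k covariance matrix of the repeated measures: project it onto k - 1 orthonormal contrasts and take tr(V)^2 / ((k - 1) tr(V^2)). The sketch below implements just that textbook formula; `gg_epsilon` is a hypothetical helper, not a function from base R or any package, and it handles one factor only. In the three-factor design of the question each effect has its own contrast set, which is what the X and M arguments of anova.mlm are meant to select.

```r
# Greenhouse-Geisser epsilon from a covariance matrix S (k x k),
# for a single within-subject factor with k levels.
gg_epsilon <- function(S) {
  k <- ncol(S)
  # Helmert contrasts are orthogonal to the constant and to each other;
  # scaling each column to unit length makes them orthonormal
  C <- contr.helmert(k)
  C <- sweep(C, 2, sqrt(colSums(C^2)), "/")
  V <- t(C) %*% S %*% C
  # V is symmetric, so tr(V %*% V) equals the sum of squared entries
  sum(diag(V))^2 / ((k - 1) * sum(V^2))
}

# Usage on simulated data: 12 subjects, 4 repeated measures
set.seed(7)
Y <- matrix(rnorm(12 * 4), nrow = 12, ncol = 4)
eps_hat <- gg_epsilon(cov(Y))
eps_hat

# A spherical covariance matrix gives epsilon = 1
gg_epsilon(diag(4))
```

The estimate always lies between 1/(k - 1) and 1; values near 1 indicate little departure from sphericity.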