Dear Amasco, Again, I'll answer briefly (since the written source that I previously mentioned has an extensive discussion):

> -----Original Message-----
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of Amasco Miralisus
> Sent: Monday, August 28, 2006 2:21 PM
> To: r-help@stat.math.ethz.ch
> Cc: John Fox; Prof Brian Ripley; Mark Lyman
> Subject: Re: [R] Type II and III sum of square in Anova (R, car package)
>
> Hello,
>
> First of all, I would like to thank everybody who answered my
> question. Every post has added something to my knowledge of the topic.
> I now know why Type III SS are so questionable.
>
> As I understood from the R FAQ, there is disagreement among
> statisticians about which SS to use
> (http://cran.r-project.org/doc/FAQ/R-FAQ.html#Why-does-the-output-from-anova_0028_0029-depend-on-the-order-of-factors-in-the-model_003f).
> However, most commercial statistical packages use Type III as
> the default (with orthogonal contrasts), such as STATISTICA,
> from which I am currently trying to migrate to R. This was
> probably done for the convenience of end users who are
> not very experienced in theoretical statistics.

Note that the contrasts are only orthogonal in the row basis of the model
matrix, not, with unbalanced data, in the model matrix itself.

> I am aware that the same result could be produced using the standard
> anova() function with Type I "sequential" SS, supplemented by the
> drop1() function, but this approach will look quite
> complicated to people without any substantial background in
> statistics, like non-math students. I would prefer an easier way,
> possibly more universal, though also probably more "for
> dummies" :) If I am not mistaken, the car package by John Fox, with
> its nice Anova() function, is a reasonable alternative for
> anyone who wishes simply to perform a quick statistical analysis
> without being afraid of messing something up in the model fitting. Of
> course, orthogonal contrasts have to be specified (for example
> contr.sum) in the case of Type III SS.
>
> Therefore, I would like to reformulate my questions, to make
> it easier for you to answer:
>
> 1. The first question relates to the answer by Professor Brian
> Ripley: Did I understand correctly from the recommended paper
> (Bill Venables' 'exegeses' paper) that it does not make much sense
> to test main effects if the interaction is significant?

Many are of this opinion. I would put it a bit differently: Properly
formulated, tests of main effects in the presence of interactions make sense
(i.e., have a straightforward interpretation in terms of population marginal
means) but probably are not of interest.

> 2. If I understood the post by John Fox correctly, I could safely use the
> Anova(.,type="III") function from car for ANOVA analyses in
> R, both for balanced and unbalanced designs? Of course,
> provided the model was fitted with orthogonal contrasts.
> Something like below:
>
> mod <- aov(response ~ factor1 * factor2, data=mydata,
>            contrasts=list(factor1=contr.sum,
>                           factor2=contr.sum))
> Anova(mod, type="III")

Yes (or you could reset the contrasts option), but why do you appear to prefer
the "type-III" tests to the "type-II" tests?

> It was also said in most of your posts that the decision about
> which type of SS to use has to be made on the basis of the
> hypothesis we want to test. Therefore, let's assume that I
> would like to test the significance of both factors, and if
> some of them are significant, I plan to use post-hoc tests to
> explore the difference(s) between levels of the significant factor(s).

Your statement is too vague to imply what kind of tests you should use. I think
that people are almost always interested in "main effects" when interactions to
which they are marginal are negligible. In this situation, both "type-II" and
"type-III" tests are appropriate, and "type-II" tests would usually be more
powerful.
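
For concreteness, here is a minimal sketch of the comparison, not a definitive
recipe. It assumes that the car package is installed; the data frame mydata and
its columns response, factor1, and factor2 are simulated purely for
illustration, with unequal cell counts so that the two sets of tests will in
general differ for the main effects. It also shows the alternative of resetting
the contrasts option:

```r
library(car)  # provides Anova(); assumed to be installed

# Hypothetical unbalanced 2 x 3 layout, simulated for illustration only
set.seed(1)
mydata <- data.frame(
  factor1 = factor(rep(c("a", "b"), times = c(14, 10))),
  factor2 = factor(rep(c("x", "y", "z"), length.out = 24))
)
mydata$response <- rnorm(nrow(mydata))

# Option 1: sum-to-zero contrasts for this model only
mod <- lm(response ~ factor1 * factor2, data = mydata,
          contrasts = list(factor1 = contr.sum, factor2 = contr.sum))

Anova(mod, type = "II")   # tests that obey marginality
Anova(mod, type = "III")  # relies on contr.sum (or similar) coding

# Option 2: reset the global contrasts option instead
old <- options(contrasts = c("contr.sum", "contr.poly"))
mod2 <- lm(response ~ factor1 * factor2, data = mydata)
Anova(mod2, type = "III")
options(old)  # restore the previous setting
```

The line for the interaction is the same in both tables, since no higher-order
term is involved; only the lines for the main effects can differ.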

Regards,
 John

> Thank you in advance,
> Amasco
>
> On 8/27/06, John Fox <[EMAIL PROTECTED]> wrote:
> > Dear Amasco,
> >
> > A complete explanation of the issues that you raise is awkward in an
> > email, so I'll address your questions briefly. Section 8.2 of my text,
> > Applied Regression Analysis, Linear Models, and Related Methods (Sage,
> > 1997) has a detailed discussion.
> >
> > (1) In balanced designs, so-called "Type I," "II," and "III" sums of
> > squares are identical. If the STATISTICA help says that Type II tests are
> > only appropriate in balanced designs, then that doesn't make a whole
> > lot of sense (unless one believes that Type-II tests are nonsense,
> > which is not the case).
> >
> > (2) One should concentrate not directly on different "types" of sums
> > of squares, but on the hypotheses to be tested. Sums of squares and
> > F-tests should follow from the hypotheses. Type-II and Type-III tests
> > (if the latter are properly formulated) test hypotheses that are
> > reasonably construed as tests of main effects and interactions in
> > unbalanced designs. In unbalanced designs, Type-I sums of squares
> > usually test hypotheses of interest only by accident.
> >
> > (3) Type-II sums of squares are constructed obeying the principle of
> > marginality, so the kinds of contrasts employed to represent factors
> > are irrelevant to the sums of squares produced. You get the same
> > answer for any full set of contrasts for each factor. In general, the
> > hypotheses tested assume that terms to which a particular term is
> > marginal are zero. So, for example, in a three-way ANOVA with factors
> > A, B, and C, the Type-II test for the AB interaction assumes that the
> > ABC interaction is absent, and the test for the A main effect assumes
> > that the ABC, AB, and AC interactions are absent (but not necessarily
> > the BC interaction, since the A main effect is not marginal to this
> > term).
> > A general justification is that we're usually not interested,
> > e.g., in a main effect that's marginal to a nonzero interaction.
> >
> > (4) Type-III tests do not assume that terms higher-order to the term
> > in question are zero. For example, in a two-way design with factors A
> > and B, the type-III test for the A main effect tests whether the
> > population marginal means at the levels of A (i.e., averaged across
> > the levels of B) are the same. One can test this hypothesis whether or
> > not A and B interact, since the marginal means can be formed whether
> > or not the profiles of means for A within levels of B are parallel.
> > Whether the hypothesis is of interest in the presence of interaction
> > is another matter, however. To compute Type-III tests using
> > incremental F-tests, one needs contrasts that are orthogonal in the
> > row-basis of the model matrix. In R, this means, e.g., using
> > contr.sum, contr.helmert, or contr.poly (all of which will give you
> > the same SS), but not contr.treatment. Failing to be careful here will
> > result in testing hypotheses that are not reasonably construed, e.g.,
> > as hypotheses concerning main effects.
> >
> > (5) The same considerations apply to linear models that include
> > quantitative predictors -- e.g., ANCOVA. Most software will not
> > automatically produce sensible Type-III tests, however.
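
Points (3) and (4) can be checked numerically. What follows is a rough sketch
using only base R, under stated assumptions: the data frame d and the helper
functions type2_ss_A() and type3_ss() are hypothetical names introduced here,
the data are simulated, and drop1() with scope . ~ . is used as one way to form
incremental tests that drop each term's columns from the model matrix while
keeping the interaction columns.

```r
# Simulated unbalanced 2 x 3 layout (hypothetical data, illustration only)
set.seed(42)
d <- data.frame(
  A = factor(rep(c("a1", "a2"), times = c(14, 10))),
  B = factor(rep(c("b1", "b2", "b3"), length.out = 24))
)
d$y <- as.integer(d$A) + as.integer(d$B) + rnorm(nrow(d))

# Type-II SS for A: compare y ~ B with y ~ A + B (marginality respected)
type2_ss_A <- function(contr) {
  reduced <- lm(y ~ B, data = d, contrasts = list(B = contr))
  full <- lm(y ~ A + B, data = d, contrasts = list(A = contr, B = contr))
  deviance(reduced) - deviance(full)  # difference in residual SS
}

# Incremental SS for A and B with the A:B columns kept in the model
type3_ss <- function(contr) {
  fit <- lm(y ~ A * B, data = d, contrasts = list(A = contr, B = contr))
  drop1(fit, . ~ ., test = "F")[c("A", "B"), "Sum of Sq"]
}

contr_list <- list(sum = contr.sum, helmert = contr.helmert,
                   treatment = contr.treatment)

ss2 <- sapply(contr_list, type2_ss_A)  # one value per coding
ss3 <- sapply(contr_list, type3_ss)    # 2 x 3 matrix: rows are A, B

print(ss2)
print(ss3)
```

The Type-II value is the same under all three codings. The incremental values
agree for contr.sum and contr.helmert, whereas under contr.treatment the
dropped columns correspond to effects at the baseline level of the other
factor rather than to main effects.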
> >
> > I hope this helps,
> >  John
> >
> > --------------------------------
> > John Fox
> > Department of Sociology
> > McMaster University
> > Hamilton, Ontario
> > Canada L8S 4M4
> > 905-525-9140x23604
> > http://socserv.mcmaster.ca/jfox
> > --------------------------------
> >
> > > -----Original Message-----
> > > From: [EMAIL PROTECTED]
> > > [mailto:[EMAIL PROTECTED] On Behalf Of Amasco Miralisus
> > > Sent: Saturday, August 26, 2006 5:07 PM
> > > To: r-help@stat.math.ethz.ch
> > > Subject: [R] Type II and III sum of square in Anova (R, car package)
> > >
> > > Hello everybody,
> > >
> > > I have some questions on ANOVA in general and on ANOVA in R
> > > in particular.
> > > I am not a statistician, so I would very much appreciate it if you
> > > answered them in a simple way.
> > >
> > > 1. First of all, a more general question. The standard anova() function
> > > for lm() or aov() models in R implements Type I sums of squares
> > > (sequential), which are not well suited for unbalanced ANOVA.
> > > Therefore it is better to use the
> > > Anova() function from the car package, which was programmed by John Fox
> > > to use Type II and Type III sums of squares. Did I get the point?
> > >
> > > 2. Now a more specific question. Type II sums of squares are not well
> > > suited for unbalanced ANOVA designs either (as stated in the STATISTICA
> > > help), therefore the general rule of thumb is to use the Anova()
> > > function with Type II SS only for balanced ANOVA and the Anova()
> > > function with Type III SS for unbalanced ANOVA?
> > > Is this a correct interpretation?
> > >
> > > 3. I have found a post from John Fox in which he wrote that Type III
> > > SS could be misleading if someone uses certain contrasts. What is
> > > this about?
> > > Could you please advise when it is appropriate to use Type II and
> > > when Type III SS? I do not use contrasts for comparisons, just
> > > general ANOVA with subsequent Tukey post-hoc comparisons.
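
The order dependence of the sequential (Type I) sums of squares raised in
question 1 can be seen in a small sketch. The data frame d below is simulated
with unequal cell counts and is purely illustrative:

```r
# Hypothetical unbalanced data, simulated for illustration only
set.seed(7)
d <- data.frame(
  A = factor(rep(c("a1", "a2"), times = c(14, 10))),
  B = factor(rep(c("b1", "b2", "b3"), length.out = 24))
)
d$y <- as.integer(d$A) + as.integer(d$B) + rnorm(nrow(d))

tab_AB <- anova(lm(y ~ A + B, data = d))  # A first, then B given A
tab_BA <- anova(lm(y ~ B + A, data = d))  # B first, then A given B

# The SS attributed to A depends on whether A enters first or last
c(A_first = tab_AB["A", "Sum Sq"], A_last = tab_BA["A", "Sum Sq"])
```

In a balanced design the two values would coincide.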
> > >
> > > Thank you in advance,
> > > Amasco
> > >
> > > ______________________________________________
> > > R-help@stat.math.ethz.ch mailing list
> > > https://stat.ethz.ch/mailman/listinfo/r-help
> > > PLEASE do read the posting guide
> > > http://www.R-project.org/posting-guide.html
> > > and provide commented, minimal, self-contained, reproducible code.