> Exactly - elementary texts and methods books recommend the welch test
> for the reason you mention. Curiously, those same texts recommend
> using anova and regression without automatically correcting for the
> possibility of non-constant variance. Why is the case of comparing
> two means different from 3? Those same books will tell you that anova
> is pretty robust to non-constant variance. well, the two sample
> t-test is anova.
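The quoted claim that the two-sample t-test *is* ANOVA can be checked numerically. A minimal sketch (the data and the use of scipy are my own illustration, not from the thread): for two groups, the one-way ANOVA F statistic equals the square of the pooled two-sample t statistic, and the p-values coincide.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
a = rng.normal(0.0, 1.0, size=20)
b = rng.normal(0.5, 1.0, size=20)

# Pooled (equal-variance) two-sample t-test
t_stat, t_p = stats.ttest_ind(a, b, equal_var=True)

# One-way ANOVA on the same two groups
f_stat, f_p = stats.f_oneway(a, b)

# With two groups, F = t^2 and the p-values agree
print(f_stat, t_stat**2)
print(f_p, t_p)
```

The equivalence holds only for the equal-variance t-test; the Welch statistic has no exact classical-ANOVA counterpart, which is part of why the textbook presentations diverge.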
I agree with you that the presentation is unfortunate. Perhaps it has something to do with the fact that heteroskedasticity-consistent covariance matrices (HCCM) for linear regression are a relatively recent development (White's estimator, building on earlier work by Huber), and that they initially performed poorly for small sample sizes. From a pedagogy standpoint, the derivations of the HCCM formulas are beyond the scope of an undergraduate course, whereas the equal-variance versions can be derived easily. Given more recent simulation studies showing that the power and level of tests based on HCCM are comparable with equal-variance regression, and that there is rarely any a priori reason to think the variances are equal, there is little reason left to default to the equal-variance versions. ANOVA is robust to violations so long as the group sizes are equal; if they aren't, it isn't.

> I don't use the welch test except as a conscious decision: ie I really
> want to compare the means while suspecting that the variances differ.
> Generally people are using the t test to certify that two populations
> are different. If the variances are wildly different, that may be
> much more important than a difference in means. in fact, to test for
> a difference in means when the variances are wildly different is
> almost always substantively silly. There was a great example a few
> years ago from a psychiatric journal, comparing two medications, where
> the investigators did a t-test for the means when one distribution was
> unimodal and the other was bi-modal; there was no statistically
> significant difference in the means, but there was a really important
> difference in the distributions. The automatic use of the welch test
> makes you feel that you are protected against Bad Things, when you
> aren't.

You may not suspect that the variances are different, but there is no a priori reason to think that they are equal. Why should you assume something you have no reason to believe is true?
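The claim that ANOVA (here, the pooled t-test) is robust only when group sizes are equal can be made concrete with a small simulation. This sketch is mine, not from the thread; the sample sizes and standard deviations are arbitrary choices that put the larger variance in the smaller group, the configuration where the pooled test is known to be anti-conservative.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
reps = 4000
n1, n2 = 10, 40        # unequal group sizes
sd1, sd2 = 3.0, 1.0    # unequal variances; the small group is the noisy one

pooled_rej = 0
welch_rej = 0
for _ in range(reps):
    # Both groups share the same mean, so every rejection is a type I error
    x = rng.normal(0.0, sd1, n1)
    y = rng.normal(0.0, sd2, n2)
    if stats.ttest_ind(x, y, equal_var=True).pvalue < 0.05:
        pooled_rej += 1
    if stats.ttest_ind(x, y, equal_var=False).pvalue < 0.05:
        welch_rej += 1

# The pooled test rejects far more often than the nominal 5%;
# the Welch test stays close to nominal.
print("pooled type I rate:", pooled_rej / reps)
print("Welch  type I rate:", welch_rej / reps)
```

Reversing the pairing (large variance in the large group) makes the pooled test conservative instead, and setting n1 = n2 brings both tests back near the nominal level, which is the equal-group-size robustness mentioned above.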
In my experience, people are not using the t-test to say that two populations are in some general way different, but rather specifically that the means differ. This is an important question regardless of whether the variances are equal. In your medication example, the shapes of the two distributions were different, but when deciding whether to approve a medication, the more important question is whether the central tendency differs: does one medication on average improve the outcome more than the other? A secondary, though important, question is how variable the outcome is. The investigators made a correct inference (in stating there was no significant mean difference between the groups), but they missed an important question that they could have asked of their data. That omission has nothing to do with the t-test. Using heteroskedasticity-robust methods DOES protect against "Bad Things." What it doesn't do is reveal the existence of important trends in the data unrelated to the hypothesis of interest.

Ian

_______________________________________________
[email protected] mailing list
https://stat.ethz.ch/mailman/listinfo/r-sig-teaching
