Dear Andy,

At the risk of muddying the waters (and certainly without wanting to advocate the use of normality tests for residuals), I believe that your point #4 is subject to misinterpretation: while it is true that t- and F-tests for regression coefficients retain their validity well in large samples when the errors are non-normal, the efficiency of the LS estimates can (depending upon the nature of the non-normality) be seriously compromised, not only absolutely but in relation to alternatives (e.g., robust regression).
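The efficiency point can be illustrated with a small simulation. This is only a sketch and not part of the original message. It assumes a simple regression with 20 fixed design points, a contaminated-normal error model (10% of errors drawn with standard deviation 10) as one example of heavy tails, and the Theil-Sen estimator as a simple stand-in for robust regression. The function names and all parameter values are hypothetical choices made for illustration.

```python
import random
import statistics


def ls_slope(x, y):
    """Ordinary least-squares slope for a simple regression of y on x."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    return sxy / sxx


def theil_sen_slope(x, y):
    """Theil-Sen slope: the median of all pairwise slopes."""
    n = len(x)
    slopes = [(y[j] - y[i]) / (x[j] - x[i])
              for i in range(n) for j in range(i + 1, n)]
    return statistics.median(slopes)


def normal_error(rng):
    """Standard normal error."""
    return rng.gauss(0.0, 1.0)


def contaminated_error(rng):
    """Assumed heavy-tailed model: 10% of errors come from N(0, 10^2)."""
    sd = 10.0 if rng.random() < 0.10 else 1.0
    return rng.gauss(0.0, sd)


def variance_ratio(error_fn, n=20, reps=2000, seed=1):
    """Return Var(LS slope) / Var(Theil-Sen slope) over simulated data sets."""
    rng = random.Random(seed)
    x = [float(i) for i in range(n)]  # fixed, distinct design points
    ls, ts = [], []
    for _ in range(reps):
        # True model (assumed): intercept 1.0, slope 0.5.
        y = [1.0 + 0.5 * xi + error_fn(rng) for xi in x]
        ls.append(ls_slope(x, y))
        ts.append(theil_sen_slope(x, y))
    return statistics.variance(ls) / statistics.variance(ts)


if __name__ == "__main__":
    print("normal errors:      ", round(variance_ratio(normal_error), 2))
    print("contaminated errors:", round(variance_ratio(contaminated_error), 2))
```

A ratio above 1 means the LS slope varies more from sample to sample than the robust alternative, that is, LS is the less efficient of the two for that error distribution. Both estimators remain centred on the true slope in this setup, so the comparison concerns efficiency only, not the validity of the tests.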
Regards,
John

--------------------------------
John Fox
Department of Sociology
McMaster University
Hamilton, Ontario
Canada L8S 4M4
905-525-9140x23604
http://socserv.mcmaster.ca/jfox
--------------------------------

> -----Original Message-----
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of Liaw, Andy
> Sent: Friday, October 15, 2004 11:55 AM
> To: 'Federico Gherardini'; Berton Gunter
> Cc: R-help mailing list
> Subject: RE: [R] Testing for normality of residuals in a
> regression model
>
> Let's see if I can get my stat 101 straight:
>
> We learned that linear regression has a set of assumptions:
>
> 1. Linearity of the relationship between X and y.
> 2. Independence of errors.
> 3. Homoscedasticity (equal error variance).
> 4. Normality of errors.
>
> Now, we should ask: Why are they needed? Can we get away
> with less? What if some of them are not met?
>
> It should be clear why we need #1.
>
> Without #2, I believe the least squares estimator is still
> unbiased, but the usual estimates of SEs for the coefficients
> are wrong, so the t-tests are wrong.
>
> Without #3, the coefficients are, again, still unbiased, but
> not as efficient as can be. Interval estimates for the
> prediction will surely be wrong.
>
> Without #4, well, it depends. If the residual DF is
> sufficiently large, the t-tests are still valid because of
> CLT. You do need normality if you have small residual DF.
>
> The problem with normality tests, I believe, is that they
> usually have fairly low power at small sample sizes, so that
> doesn't quite help. There's no free lunch: A normality test
> with good power will usually have good power against a fairly
> narrow class of alternatives, and almost no power against
> others (directional test). How do you decide what to use?
>
> Has anyone seen a data set where the normality test on the
> residuals is crucial in coming up with an appropriate analysis?
>
> Cheers,
> Andy
>
> > From: Federico Gherardini
> >
> > Berton Gunter wrote:
> >
> > >>>Exactly! My point is that normality tests are useless for this purpose for
> > >>>reasons that are beyond what I can take up here.
> > >>>
> > Thanks for your suggestions, I understand that! Could you possibly
> > give me some (not too complicated!) links so that I can investigate
> > this matter further?
> >
> > Cheers,
> >
> > Federico
> >
> > >>>Hints: Balanced designs are
> > >>>robust to non-normality; independence (especially "clustering" of subjects
> > >>>due to systematic effects), not normality is usually the biggest real
> > >>>statistical problem; hypothesis tests will always reject when samples are
> > >>>large -- so what!; "trust" refers to prediction validity which has to do
> > >>>with study design and the validity/representativeness of the current data to
> > >>>future.
> > >>>
> > >>>I know that all the stats 101 tests say to test for normality, but they're
> > >>>full of baloney!
> > >>>
> > >>>Of course, this is "free" advice -- so caveat emptor!
> > >>>
> > >>>Cheers,
> > >>>Bert
> >
> > ______________________________________________
> > [EMAIL PROTECTED] mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide!
> > http://www.R-project.org/posting-guide.html