Let's see if I can get my stat 101 straight: We learned that linear regression has a set of assumptions:
1. Linearity of the relationship between X and y.
2. Independence of errors.
3. Homoscedasticity (equal error variance).
4. Normality of errors.

Now, we should ask: Why are they needed? Can we get away with less? What if some of them are not met?

It should be clear why we need #1.

Without #2, I believe the least squares estimator is still unbiased, but the usual estimates of the SEs for the coefficients are wrong, so the t-tests are wrong.

Without #3, the coefficients are, again, still unbiased, but not as efficient as can be. Interval estimates for the prediction will surely be wrong.

Without #4, well, it depends. If the residual DF is sufficiently large, the t-tests are still valid because of the CLT. You do need normality if you have small residual DF.

The problem with normality tests, I believe, is that they usually have fairly low power at small sample sizes, so that doesn't quite help. There's no free lunch: A normality test with good power will usually have good power against a fairly narrow class of alternatives, and almost no power against others (directional test). How do you decide what to use?

Has anyone seen a data set where the normality test on the residuals is crucial in coming up with an appropriate analysis?

Cheers,
Andy

> From: Federico Gherardini
>
> Berton Gunter wrote:
>
> >>>Exactly! My point is that normality tests are useless for this purpose for
> >>>reasons that are beyond what I can take up here.
> >>>
> Thanks for your suggestions, I understand that! Could you possibly give
> me some (not too complicated!)
> links so that I can investigate this matter further?
>
> Cheers,
>
> Federico
>
> >>>Hints: Balanced designs are
> >>>robust to non-normality; independence (especially "clustering" of subjects
> >>>due to systematic effects), not normality is usually the biggest real
> >>>statistical problem; hypothesis tests will always reject when samples are
> >>>large -- so what!; "trust" refers to prediction validity which has to do
> >>>with study design and the validity/representativeness of the current data to
> >>>future.
> >>>
> >>>I know that all the stats 101 tests say to test for normality, but they're
> >>>full of baloney!
> >>>
> >>>Of course, this is "free" advice -- so caveat emptor!
> >>>
> >>>Cheers,
> >>>Bert
> >>>
>
> ______________________________________________
> [EMAIL PROTECTED] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide!
> http://www.R-project.org/posting-guide.html
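
A rough way to check the claim about #4 (non-normal errors, large residual DF) is a small simulation. The sketch below is not from the thread. The function names, the fixed design, the centered exponential errors, and the use of the normal quantile in place of the t quantile are all assumptions made for illustration. It is written in Python using only the standard library.

```python
import random
import statistics


def slope_t_stat(x, y):
    """t statistic for H0: slope = 0 in the simple regression y = a + b*x + e."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b = sxy / sxx                      # least squares slope
    a = ybar - b * xbar                # least squares intercept
    rss = sum((yi - a - b * xi) ** 2 for xi, yi in zip(x, y))
    se_b = (rss / (n - 2) / sxx) ** 0.5  # usual SE, assumes #2 and #3 hold
    return b / se_b


def rejection_rate(n, n_sim=2000, seed=1):
    """Share of simulated data sets where the slope test rejects at about 5%.

    The true slope is 0, so this estimates the type I error rate.
    Errors are exponential(1) minus 1: mean zero, but clearly skewed.
    """
    rng = random.Random(seed)
    # Assumption: normal quantile used as a stand-in for the t quantile,
    # which is only reasonable when n - 2 is large.
    crit = statistics.NormalDist().inv_cdf(0.975)
    x = [i / n for i in range(n)]      # hypothetical fixed design
    rejections = 0
    for _ in range(n_sim):
        y = [rng.expovariate(1.0) - 1.0 for _ in range(n)]
        if abs(slope_t_stat(x, y)) > crit:
            rejections += 1
    return rejections / n_sim


rate = rejection_rate(200)
print(f"rejection rate under H0 with skewed errors, n=200: {rate:.3f}")
```

With 198 residual DF the rejection rate should land close to the nominal 5% even though the errors are far from normal. Repeating this with a small n would need a proper t quantile, and that is the regime where normality starts to matter.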