In article <[EMAIL PROTECTED]>, Spuzzz <[EMAIL PROTECTED]> wrote:

>OK all you statistical junkies are going to jump on my case now.  From
>what I understand this is one of, if not the most important, proofs in
>all of statistics.  And I just don't get it.
>I remember learning it in college, where I must admit, I was more
>interested in getting a passing grade than truly understanding the
>concept.  Now I would like to change that.

>So I understand the mathematical conclusions that are drawn -- that by
>drawing enough sample means, you will approach a normal distribution
>even if the population distribution is not normal.  It's kind of cool,
>but what I don't understand is, what is the practical implication?  We
>are talking about taking multiple samples and then looking at the
>distribution of the mean of each sample.

You are correct.  The distribution of the sample mean for
distributions with finite variances APPROACHES a normal distribution,
but is not a normal distribution unless the original distribution is
normal.

>This would be very useful if it said something like, the means and/or
>variances of a sample are similar to the means/variances of a
>population.  But it doesn't say that!  It only talks about the means
>of multiple samples having a normal distribution.

>Here's one thing that I read:

>"...we can use the normal distribution with virtually any population
>we are likely to encounter, all we have to do is grab a nice big
>sample and take the average of the sample..."

Whoever wrote this does not understand the CLT.

>So what is so great about using a normal distribution?  It must be the
>basis of later statistical techniques.  But statisticians are
>frequently excited about the CLT, as if by itself it has obvious
>practical applications (like, say, the Pythagorean theorem)...

Those who understand probability are not that excited about the CLT,
as they know about the errors.  However, the CLT and the results about
well-behaved transformations do give results about the limiting
distribution of statistical analyses, and this is often all that can
reasonably be used.

For example, in regular problems, twice the logarithm of the
likelihood ratio, in likelihood ratio tests, is asymptotically
chi-squared with the usual number of degrees of freedom.  Tests for
regression coefficients are asymptotically correct.  Tests for
correlations for values other than 0 are not.  Least squares and
similar procedures are good despite lack of normality, provided that
one does not disturb the underlying linear structure; but they are not
maximum likelihood.  Any attempt to make the observations more normal
is likely to destroy any useful structure.  Normality is NOT usually
important, and probably never occurs in nature.

>Please excuse my ignorance...

Ignorance is not a sin.  Ignoring it is.
-- 
This address is for information only.  I do not claim that these views
are those of the Statistics Department or of Purdue University.
Herman Rubin, Department of Statistics, Purdue University
[EMAIL PROTECTED]   Phone: (765)494-6054   FAX: (765)494-0558
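
The approach-but-never-reach behavior of the sample mean can be seen
directly by simulation.  Below is a minimal Python sketch (assuming
numpy and scipy are available; the Exp(1) population and the sample
sizes are arbitrary choices for illustration): the skewness of the
sample mean shrinks toward the normal value of 0 as n grows, but for
any finite n it stays strictly positive.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(0)
    for n in (2, 10, 100):
        # 20000 replications; each row is one sample of size n from Exp(1)
        means = rng.exponential(scale=1.0, size=(20000, n)).mean(axis=1)
        # Exp(1) has skewness 2, so the sample mean has skewness 2/sqrt(n):
        # it shrinks toward the normal value of 0 but never reaches it
        print(n, round(float(stats.skew(means)), 3))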
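
The asymptotic chi-squared result for likelihood ratio tests can be
checked the same way.  The sketch below is one hedged illustration
(again assuming numpy and scipy; the Poisson model with null value
lambda = 3 is an arbitrary choice, not anything specific to the post):
under the null, twice the log likelihood ratio should exceed the
chi-squared(1) 95th percentile about 5% of the time.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(1)
    lam0, n, reps = 3.0, 200, 10000
    lr = np.empty(reps)
    for i in range(reps):
        x = rng.poisson(lam=lam0, size=n)   # data generated under the null
        lam_hat = x.mean()                  # MLE of the Poisson mean
        # twice the log likelihood ratio; the log(x!) terms cancel
        lr[i] = 2 * (x.sum() * np.log(lam_hat / lam0) - n * (lam_hat - lam0))
    # rejection rate at the chi-squared(1) 5% critical value; about 0.05
    print(np.mean(lr > stats.chi2.ppf(0.95, df=1)))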
