Rich Ulrich <[EMAIL PROTECTED]> wrote in message
news:[EMAIL PROTECTED]...
> On Wed, 3 Apr 2002 08:13:18 +0200, "Laurence" <[EMAIL PROTECTED]>
> wrote:
>
> > Mr Ulrich,
> >
> > No, it's only to make a chi-square test on residuals against a
> > chi-square value. This test is needed to check whether the model I
> > have found by the least-squares method is valid or not. So I have
> > to verify whether the residual distribution is Normal. I have to do
> > a chi-square test, but in this test we have to find the frequencies
> > in different intervals.
>
> This reminds me of why the Artificial Intelligence approach to
> statistical consulting doesn't work yet, and why it will never
> work with a small decision tree.
>
> It's true that 'normal' is a useful condition for residuals.
> It's true that there is a chi-squared test for normality,
> built on something like a contingency table.
>
> But it is also true that the X^2 test is not a very powerful
> test for normality in general, and it is especially (in my
> opinion) not very appropriate as a test on residuals,
> since what matters for residuals is non-normality of
> other sorts (and I don't know a test for them, either).
>
>  - an extreme outlier or two may invalidate any
> least-squares statistics without disturbing that X^2.
>
> If I had to *test* for normality of residuals, I would certainly
> prefer either of the other two popular tests (Shapiro-Wilk
> or K-S) over that one, by a large margin.
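Rich's point -- that an outlier or two can wreck least-squares inference while barely disturbing the X^2 statistic, whereas Shapiro-Wilk reacts strongly -- is easy to see with a quick simulation. The sketch below is my own illustration, not part of the thread; the sample size, outlier magnitudes, and ten equiprobable bins are all arbitrary choices made for the demonstration, using Python's scipy.

```python
# Sketch (not from the thread): compare a chi-square goodness-of-fit
# test with Shapiro-Wilk on a "residual-like" sample that is normal
# except for two extreme outliers.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
resid = rng.normal(size=200)
resid[:2] = [5.0, -5.0]   # two outliers that would invalidate least squares

# Shapiro-Wilk is sensitive to the heavy tails the outliers create.
sw_stat, sw_p = stats.shapiro(resid)

# Chi-square: standardize, bin into 10 equiprobable cells of the
# standard normal, and compare observed with expected counts.  The
# choice of 10 bins is arbitrary -- itself a weakness of this test.
z = (resid - resid.mean()) / resid.std(ddof=1)
edges = stats.norm.ppf(np.linspace(0, 1, 11))   # -inf ... +inf
observed, _ = np.histogram(z, bins=edges)
expected = np.full(10, len(z) / 10)
chi_stat, chi_p = stats.chisquare(observed, expected, ddof=2)  # 2 fitted params

print(f"Shapiro-Wilk: p = {sw_p:.2e}")
print(f"Chi-square:   p = {chi_p:.3f}")
```

With this seed the Shapiro-Wilk p-value is far smaller than the chi-square one: the two outliers land in bins that already hold a handful of observations, so the binned counts barely move, while the W statistic responds directly to the extreme order statistics.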
Of course, as Rich suggests, hypothesis tests of normality consider
entirely the wrong thing. In so far as non-normality affects a
procedure (and it is generally only an issue when making inferences,
though efficiency of estimates might become a consideration in other
circumstances), it is not the p-value of the test statistic that tells
you how serious the problem is. The test statistic is a measure of
deviation from normality, and it's how non-normal your data are that's
the concern, not its statistical significance. That assumes, of
course, that your test statistic is a good measure of how much the
non-normality affects your inference -- and it usually won't be,
certainly not in the case of the chi-square!

I sometimes use the Shapiro-Francia statistic as a rough guide -- but
a high p-value doesn't mean all is okay, and a low p-value doesn't
mean all isn't okay. To make use of a statistic in this way, you also
need to build up (from simulation, for example) some intuition about
how much certain kinds of deviation from normality affect your
results -- so you'll still want to look at what is causing the
deviation.

These decisions need to relate to the kind of inference you're making.
If you want to make inferences about the upper tail, some outliers in
the lower tail won't matter -- but outliers in the upper tail could
affect the inference substantially! (And if you're making inferences
about the upper tail -- say, for an insurance application -- you don't
want to be using any of the standard tests of normality as a guide to
the effects of a deviation from the normality assumption.)

Glen

=================================================================
Instructions for joining and leaving this list, remarks about the
problem of INAPPROPRIATE MESSAGES, and archives are available at:
      http://jse.stat.ncsu.edu/
=================================================================
