- I am nearly done with this topic -

On 18 Jun 2003 08:32:55 -0700, [EMAIL PROTECTED] (dave martin) wrote:
> Rich Ulrich <[EMAIL PROTECTED]> wrote on 6/17/03 3:02:20 PM:
>
> >As we have said several times, "Adding one more parameter", if that
> >is what you are doing, gives a nested model, where the F-test is
> >proper and well-known (given: other assumptions).
>
> I agree & point out that I'm not adding one more parameter; rather,
> I'm using a different model.

NOW -- you are over-snipping, and mis-citing yourself. Here is what
you wrote, in the two sentences prior to what you quoted of me, above:

  "On the other hand, many many researchers use the F-test to see if
  adding one more parameter to a model is beneficial. In my experience
  they usually (I've seen no counterexample) use the same data to
  generate the two mean square errors."

> See Bevington, "Data Reduction and Error Analysis for the Physical
> Sciences", McGraw-Hill, 1969, p. 196 ff. In particular, p. 200 ff
> briefly discusses the derivation of the nested form to which you
> refer.

Yes, page 200 briefly shows the testing of the nested model. Are you
asking me to discuss it? I assume that you are, since you do not read
it correctly on your own.

Whenever you add a parameter, the residual is decreased; the amount of
decrease is distributed as chi-squared with degrees of freedom (DF)
equal to 1 for 1 parameter, and so on. The Decrease is the *numerator*;
the Residual is the denominator. Those are independent (in the sense of
'independence' that is required), so that the ratio of
terms-divided-by-DF is "distributed as F" -- as we say.

> It seems to me that adding one parameter is a special case of using a
> different model.

Huh! Yes, as I have said in 3 notes now, it is THE special case that
gives the test that everyone uses when they can.

> Note that I'm not using the special difference form of the F test; in
> my case the numerator and denominator are the reduced chisquares for
> the two models.
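To make the Decrease-over-Residual arithmetic concrete, here is a
minimal sketch in plain Python. The data and the function name are
invented for illustration (not from either of our examples); it
compares an intercept-only model to intercept-plus-slope, so the
numerator has 1 DF and the denominator has n-2:

```python
def nested_f(x, y):
    """F-test for adding one parameter (a slope) to an intercept-only model.

    Numerator:   the 1-DF decrease in residual SS from adding the slope.
    Denominator: residual SS of the fuller model over its DF (n - 2).
    """
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n

    # Reduced model y = a (intercept only): residual SS is SS about the mean
    rss_reduced = sum((yi - ybar) ** 2 for yi in y)

    # Full model y = a + b*x, ordinary least squares by the textbook formulas
    sxx = sum((xi - xbar) ** 2 for xi in x)
    sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = ybar - b * xbar
    rss_full = sum((yi - (a + b * xi)) ** 2 for xi, yi in zip(x, y))

    # Decrease (1 DF) over residual of the fuller model (n - 2 DF)
    return ((rss_reduced - rss_full) / 1) / (rss_full / (n - 2))

# Invented data with a clear linear trend, so the F should be large
x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [1.1, 2.0, 2.9, 4.2, 4.8, 6.1]
F = nested_f(x, y)
```

Note that the residual of the reduced model can only go down when the
slope is added, so the numerator is never negative; that is the "any
amount attributed to one piece must come out of the other" property.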
I noted that before; the ratio seems to be something that you have
invented yourself; it is not something that I have ever heard of
anyone using.

You asked before whether you could form that ratio if the data were
from different samples: YES. However, as a practical matter, that
would be confusing.

The one place where the "variance ratio" test used to be in a
statistical package was in the routine that SPSS (for one) had for
doing the t-test. That is, it is nice to know whether the variances
are unequal, since the t-test is less robust when Ns and variances
differ by a lot. Also, some people mistakenly want to use that
test-for-variances as a precondition for which version of the t-test
to look at. Anyway, SPSS used to give the ratio of the
larger-over-smaller variance as the test for variances. (As it
happens, it is not very appropriate for the task, so SPSS now has a
different test for that purpose.)

If you took two samples and compared their residual variances with the
SAME parameterization, you would test whether those residuals were
equal; that might be a funny sort of hypothesis, but it would be
legitimate. You get into something funnier when you use one
parameterization for one sample and another for the other. Yes, you
could get a couple of versions of legitimate F-tests, but they both
would confound the hypothesis about 'parameters' with the funny
hypotheses about native sample differences.

> As another example of comparing models, say I'm faced with
> determining thermal properties from temperature distribution data in
> a cooling sphere. One model is that the temperature distribution is
> parabolic while another model is a cosine function; there is only
> one parameter in each model. I suggest that the F-test can in fact
> be used to see if these two models can be distinguished from the
> data (to paraphrase R Dodier: is the dataset sufficiently large to
> distinguish between the two models?).
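For the record, the old larger-over-smaller "variance ratio" I
mentioned above is nothing more than this (a plain-Python sketch with
an invented function name; the df pair is what you would carry to an
F table):

```python
def variance_ratio(s1, s2):
    """Larger-over-smaller ratio of two sample variances,
    with the matching (numerator_df, denominator_df) pair."""
    def sample_var(s):
        n = len(s)
        m = sum(s) / n
        return sum((x - m) ** 2 for x in s) / (n - 1)

    v1, v2 = sample_var(s1), sample_var(s2)
    if v1 >= v2:
        return v1 / v2, len(s1) - 1, len(s2) - 1
    return v2 / v1, len(s2) - 1, len(s1) - 1

# Invented samples: the second has 10x the spread, so 100x the variance
ratio, df_num, df_den = variance_ratio([1, 2, 3, 4, 5],
                                       [10, 20, 30, 40, 50])
```

This is the form SPSS used to report; as I said, it is not a very good
screen for the t-test, which is why it was replaced.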
> I agree about the need for independent numerator and denominator. Is
> that requirement somehow relaxed when using the nested model with
> one additional parameter approach?

In the nested model, the *numerator* is the one-d.f. term and the
denominator is the residual of the fuller model. I showed last time
how you could write either one of your models as a nested version of
the other, by incorporating a parameter as a power of 1/T.

> >have a test, so it lends itself to the AIC or BIC -- those are
> >attempts at borrowing the logic of the (somewhat-similar) nested
> >tests.
>
> I looked into the AIC & BIC approaches as soon as you suggested
> them. They look useful. I've not yet seen how one can ascribe a
> level of confidence to a difference in AIC measures. I'd appreciate
> it if you'd provide a pointer to such a discussion.
>
> >I think that the AIC and BIC differ mainly in how much they
> >penalize extra degrees of freedom.
>
> PS I find the weight ascribed to additional degrees of freedom in a
> chisquare or F-test disquieting. It somehow doesn't seem fair to
> give as much weight to an additional data point as one gives to an
> additional parameter which operates on all the data.

"... as much weight" ... Weight is not the role. Numerator or
denominator, each term estimates a "variance". A variance is a "sum"
of squares divided by its effective N. If you add a data point, you
add to one version of N. If you add a parameter, you add to the OTHER,
independent "version" of an N.

In the simple ANOVA model, there is the Total Sum of Squares, which is
the sum of the Within and the Between. The F-test is the ratio of
(Between/B_df) over (Within/W_df). There are a lot of consistent
pieces of the technical details that might help you figure this: any
amount newly attributed to the Between (or Regression) Sum of Squares
has to reduce the Within -- since the Total is fixed. That holds for
the d.f. as well as the SS.
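You can verify the fixed-Total bookkeeping yourself. A plain-Python
sketch of one-way ANOVA (invented toy data, illustrative function
name): the Between and Within sums of squares are computed separately,
and they must add back up to the Total.

```python
def one_way_anova(groups):
    """One-way ANOVA decomposition: SS_total = SS_between + SS_within,
    and F = (SS_between/df_b) / (SS_within/df_w)."""
    all_y = [y for g in groups for y in g]
    n = len(all_y)
    grand = sum(all_y) / n

    # Total SS about the grand mean -- this quantity is fixed by the data
    ss_total = sum((y - grand) ** 2 for y in all_y)

    ss_within = 0.0   # SS of each point about its own group mean
    ss_between = 0.0  # SS of group means about the grand mean, weighted by n_g
    for g in groups:
        gmean = sum(g) / len(g)
        ss_within += sum((y - gmean) ** 2 for y in g)
        ss_between += len(g) * (gmean - grand) ** 2

    k = len(groups)
    df_b, df_w = k - 1, n - k          # the d.f. also partition: (n-1) = df_b + df_w
    f_stat = (ss_between / df_b) / (ss_within / df_w)
    return ss_total, ss_between, ss_within, f_stat

# Two invented groups of three observations each
sst, ssb, ssw, F = one_way_anova([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
```

Running this, sst equals ssb + ssw exactly, which is the point: anything
credited to Between is taken away from Within, for the d.f. as well as
the SS.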
You have been, apparently, thoroughly at sea about testing. I don't
have high hopes that this will have filled all the gaps. I can't give
you, in a few paragraphs, the content of the first week or two of a
course in statistical theory. You *might* be able to get some good
from browsing the first few chapters of one or a few books on
statistical theory.

--
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html
"Taxes are the price we pay for civilization."  Justice Holmes.

=================================================================
Instructions for joining and leaving this list, remarks about the
problem of INAPPROPRIATE MESSAGES, and archives are available at:
http://jse.stat.ncsu.edu/
=================================================================
