- I am nearly done with this topic -
On 18 Jun 2003 08:32:55 -0700, [EMAIL PROTECTED] (dave martin)
wrote:

> Rich Ulrich <[EMAIL PROTECTED]> wrote on 6/17/03 3:02:20 PM:
> 
> >As we have said several times, 
> >"Adding one more parameter", if that is what you are 
> >doing, gives a nested model, where the F-test is proper 
> >and well-known (given:  other assumptions).
> 
> I agree & point out that I'm not adding one more parameter, rather,
> I'm using a different model.

NOW -- you are over-snipping, and mis-citing yourself.  
Here is what you wrote, in the two sentences prior to what you 
quoted of me, above.

"On the other hand, many many researchers use the F-test 
to see if adding one more parameter to a model is beneficial. 
In my experience they usually (I've seen no counterexample) 
use the same data to generate the two mean square errors."


> 
> See Bevington, "Data Reduction and Error Analysis for the Physical
> Sciences", McGraw-Hill, 1969. p196 ff.  In particular, p200 ff briefly
> discuss the derivation of the nested form to which you refer.

Yes, Page 200 briefly shows the testing of the nested model.
Are you asking me to discuss it?  I assume that you are, since 
you did not read it correctly on your own.
 - Whenever you add a parameter, the residual is decreased;
the amount of decrease is distributed as chi-squared with degrees
of freedom (DF) equal to 1 for 1 parameter, and so on.

The Decrease is the *numerator*; the Residual is the denominator.
Those are independent (in the sense of 'independence'
that is required), so the ratio, with each term divided 
by its DF, is "distributed as F" -- as we say.
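To make that concrete, here is a minimal sketch in Python (the data and models are simulated, purely for illustration): the reduced model has an intercept only, the full model adds a slope, and the F statistic is the RSS *decrease* over 1 DF, divided by the full model's residual over its DF.

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated data: y really does depend linearly on x, plus noise.
n = 50
x = np.linspace(0, 1, n)
y = 2.0 + 3.0 * x + rng.normal(scale=0.5, size=n)

# Reduced model: intercept only (1 parameter).
# Full model: intercept + slope (2 parameters) -- the nested pair.
X0 = np.ones((n, 1))
X1 = np.column_stack([np.ones(n), x])

def rss(X, y):
    """Residual sum of squares from a least-squares fit."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ beta
    return r @ r

rss0, rss1 = rss(X0, y), rss(X1, y)

# Numerator: the *decrease* in RSS, with DF = 1 (one added parameter).
# Denominator: the residual of the fuller model, with DF = n - 2.
F = ((rss0 - rss1) / 1) / (rss1 / (n - 2))
print(F)
```

Adding the parameter can only shrink the residual, so `rss0 - rss1` is never negative; the question the F ratio answers is whether the shrinkage is bigger than chance would give.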


> 
> It seems to me that adding one parameter is a special case of using a
> different model. 

Huh!  Yes, as I have said in 3 notes  now, it is THE special case 
that gives the test that everyone uses when they can.

>                      Note that I'm not using the special difference form
> of the F test; in my case the numerator and denominator are the
> reduced chisquares for the two models.

I noted that before; the ratio seems to be something that you
have invented yourself; it is not something that I have 
ever heard of anyone using.

You asked before whether you could form that ratio
if the data were from different samples:  YES.
However, as a practical matter, that would be 
confusing.  The one place where the "variance ratio" 
test used to be in a statistical package was in the routine
that SPSS (for one)  had for doing the t-test.  That is,
it is nice to know whether the variances are unequal,
since the t test is less robust when Ns and  variances
are different by a lot.  Also, some people mistakenly 
want to use that test-for-variances, as a precondition
as to which version of the t-test to look at.  Anyway,
SPSS used to give the ratio of the larger-over-smaller
variance, as the test for variances.  (As it happens, it
is not very appropriate for the task, so SPSS now has
a different test for that purpose.)
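For reference, that larger-over-smaller variance ratio is easy to reproduce; this sketch uses simulated samples (everything here is made up for illustration) and folds the ratio so it is always at least 1, doubling the tail probability for a two-sided p-value. The modern replacement alluded to above is a robust test such as Levene's (`scipy.stats.levene`).

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
a = rng.normal(scale=1.0, size=30)   # smaller true variance
b = rng.normal(scale=2.0, size=25)   # larger true variance

va, vb = np.var(a, ddof=1), np.var(b, ddof=1)

# Larger-over-smaller, the way the old SPSS t-test output reported it.
if va >= vb:
    F, dfn, dfd = va / vb, len(a) - 1, len(b) - 1
else:
    F, dfn, dfd = vb / va, len(b) - 1, len(a) - 1

# Two-sided p-value for the folded ratio.
p = 2 * stats.f.sf(F, dfn, dfd)
print(F, p)
```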

If you took two samples and compared their residual
variances with the SAME parameterization, you would
test whether those residuals were equal; that might be
a funny sort of hypothesis, but it would be legitimate.
Now, you get into something funnier when you use one
parameterization for one sample, and another for the
other.  Yes, you could get a couple of versions of 
legitimate F-tests.  But they both would confound the 
hypothesis about 'parameters'  with the funny hypotheses
about native sample differences.
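The first, "legitimate but funny" case above -- two samples, the SAME parameterization -- can be sketched like this (simulated data, for illustration only): fit the same straight line to each sample and take the ratio of the two residual mean squares. Because the samples are independent, the ratio is a proper F; it just tests equality of error variances, not anything about the parameters.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)

def fit_rss(x, y):
    """RSS of a straight-line (2-parameter) least-squares fit."""
    X = np.column_stack([np.ones_like(x), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    r = y - X @ beta
    return r @ r

# Two independent samples, SAME parameterization (a line) for each.
x1 = np.linspace(0, 1, 40)
y1 = 1 + 2 * x1 + rng.normal(scale=0.5, size=40)
x2 = np.linspace(0, 1, 35)
y2 = 1 + 2 * x2 + rng.normal(scale=0.5, size=35)

df1, df2 = 40 - 2, 35 - 2
F = (fit_rss(x1, y1) / df1) / (fit_rss(x2, y2) / df2)
p = 2 * min(stats.f.sf(F, df1, df2), stats.f.cdf(F, df1, df2))
print(F, p)
```

Change the parameterization on one side and the same ratio starts confounding model differences with sample differences, which is the objection above.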


> 
> As another example of comparing models, say I'm faced with determining
> thermal properties from temperature distribution data in a cooling
> sphere. One model is that the temperature distribution is parabolic
> while another model is a cosine function; there is only one parameter
> in each model. I suggest that the F-test can in fact be used to see if
> these two models can be distinguished from the data (to paraphrase R
> Dodier, Is the dataset sufficiently large to distinguish between the
> two models?).
> 
> I agree about the need for independent numerator and denominator.  Is
> that requirement somehow relaxed when using the nested model with one
> additional parameter approach?

In the Nested model, the *numerator*  is the ONE  d.f.  and the
denominator is the residual of the fuller model.  I showed last
time how you could write either one of your models as a
nested version of the other, by incorporating a parameter as
a power of 1/T .

> 
> >have a test, so it lends itself to the AIC  or BIC -- those 
> >are attempts at borrowing the logic of the 
> >(somewhat-similar)  nested tests.  
> 
> I looked into the AIC & BIC approaches as soon as you suggested them.
> They look useful.  I've not yet seen how one can ascribe a level of
> confidence to a difference in AIC measures.  I'd appreciate it if
> you'd provide a pointer to such a discussion.
> 
> >I think that the AIC and BIC differ mainly in how much 
> >they penalize extra degrees of freedom. 
> >
> 
> PS I find the weight ascribed to additional degrees of freedom in a
> chisquare or F-test disquieting. It somehow doesn't seem fair to give
> as much weight to an additional data point as one gives to an
> additional parameter which operates on all the data.

"... as much weight"...  Weight is not the right notion here.  
Numerator or denominator, each term estimates a "variance".
A variance is a sum of squares divided by its effective N.

If you add a data point, you add to one version of N.
If you add a parameter, you add to the OTHER, 
independent "version" of an N.  In the simple ANOVA
model, there is the Total Sum of Squares, which is
the sum of the Within and the Between.  The F-test
is the ratio of the (Between/B_df) over the 
(Within/W_df).

There are a number of consistent technical details 
that might help you figure this out...  any amount newly 
attributed to the Between (or Regression) Sum of Squares 
has to reduce the Within, since the Total is fixed.  
That holds for the d.f. as well as the SS.
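That partition is easy to verify numerically; here is a sketch with three simulated groups (the group means and sizes are arbitrary, chosen only for illustration), checking that Total = Between + Within for the sums of squares and that the hand-built F matches scipy's one-way ANOVA.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)
groups = [rng.normal(loc=m, scale=1.0, size=20) for m in (0.0, 0.5, 1.0)]

allvals = np.concatenate(groups)
grand = allvals.mean()

ss_total = ((allvals - grand) ** 2).sum()
ss_within = sum(((g - g.mean()) ** 2).sum() for g in groups)
ss_between = sum(len(g) * (g.mean() - grand) ** 2 for g in groups)

# The partition: Total = Between + Within, for SS and for d.f. alike.
assert np.isclose(ss_total, ss_between + ss_within)
k, n = len(groups), len(allvals)
b_df, w_df = k - 1, n - k            # total d.f. = n - 1 = b_df + w_df

F = (ss_between / b_df) / (ss_within / w_df)
F_scipy, _ = stats.f_oneway(*groups)
print(F, F_scipy)
```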


You have been, apparently, thoroughly at sea about 
testing.  I don't have high hopes that this will have filled
all the gaps.  I can't give you in a few paragraphs the content 
of the first week or two of a course in statistical theory.

You  *might*  be able to get some good from browsing 
the first few chapters of one or a few books on statistical
theory.

-- 
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html
"Taxes are the price we pay for civilization."  Justice Holmes.