On 15 Jun 2003 00:04:35 -0700, [EMAIL PROTECTED] (Mohammad
Ehsanul Karim) wrote:

> Rich Ulrich <[EMAIL PROTECTED]> wrote in message news:<[EMAIL PROTECTED]>...
> > You do need to meet assumptions in order to trust the
> > statistical tests.
> > You need to meet some additional assumptions in order
> > to trust the implications of the regressions coefficients.
 
> Additional assumption .. such as ..?
> 
 ... 
I'll summarize that another way.

Absolutely no assumptions are needed in order
to *perform*  a regression, so long as you don't
run into illegal arithmetic -- division by zero, a
singular matrix, and so on.  I'm saying: if you can
compute it, then it is legal to compute it.
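A minimal sketch of that point, in Python with numpy -- the
data here are made up, deliberately non-normal, just to show
that the arithmetic goes through regardless:

    import numpy as np

    # Any numbers at all: no distributional assumptions are
    # needed just to compute a least-squares fit.
    rng = np.random.default_rng(0)
    X = np.column_stack([np.ones(20), rng.normal(size=20)])  # intercept + predictor
    y = rng.exponential(size=20)    # skewed, non-normal outcome

    # b = (X'X)^-1 X'y -- this fails only on "illegal arithmetic",
    # i.e. a singular X'X; lstsq sidesteps even that.
    b, *_ = np.linalg.lstsq(X, y, rcond=None)
    print(b)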

However, there are different stages of generalizing.

It is rational to conclude that you can draw more 
conclusions from "better data" -- whatever "better" means.
 - large Ns with good randomness from the universe of
interest; continuous scores;  no outliers;  meaningful,
linear scaling; ...

"broken assumptions"  is one way to say, Here's why
that one did not come out right.



There are no assumptions needed to DO  a regression.

There are certain numerical assumptions (or limits)
behind creating a valid statistical test - large enough
N;  independence of errors; a range of scoring;
 ... not much else.
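For one of those -- independence of errors -- here is a sketch
of a check, assuming statsmodels is available (the data and the
names are mine).  The Durbin-Watson statistic on the residuals
sits near 2 when the errors are serially independent:

    import numpy as np
    import statsmodels.api as sm
    from statsmodels.stats.stattools import durbin_watson

    rng = np.random.default_rng(1)
    x = rng.normal(size=100)
    y = 2.0 + 0.5 * x + rng.normal(size=100)  # independent errors by construction

    fit = sm.OLS(y, sm.add_constant(x)).fit()
    # Near 2 => consistent with independent errors; values near
    # 0 or 4 suggest serial correlation.
    print(durbin_watson(fit.resid))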

There are further assumptions or adjustments before
accepting the point estimates of effect sizes - correction for
attenuation, for instance, if the predictors are "measured
with error".   But tests are often unaffected by *that*  sort
of bias in estimates.
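The classical (Spearman) disattenuation formula makes that
concrete -- a sketch, with reliabilities I simply made up:

    import math

    def disattenuate(r_obs, rel_x, rel_y):
        # Spearman's correction for attenuation: estimated
        # correlation between the true scores, given the
        # observed r and the reliability of each measure.
        return r_obs / math.sqrt(rel_x * rel_y)

    # Illustrative numbers only: observed r = .40,
    # reliabilities .70 and .80.
    print(disattenuate(0.40, 0.70, 0.80))   # about 0.53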

There are harder assumptions to meet, before accepting
the logic of causation.  The logic is open to argument about
"outside effects",  any time that you don't start with a
randomized trial.  Thus, we can compare Males to Females;
if the difference is 5 points [of something], that might be a
good estimate or an underestimate, based on scoring
reliability.  On the other hand, it might be  unjustified
to attribute the effect to "gender"  if someone can come up
with an outside reason that is more basic [ physical size, 
say]  to account for the  measured difference.

The situation is somewhat asymmetrical when we compare
M to F  for a huge sample.  If  *no*  effect  is 'significant', we 
conclude that there "probably"  is not any sizable difference.
If *some*  effect is demonstrated, then the whole world
is invited to make suggestions to account for it.  

Thus, the Anglo-American  psychologists doing intelligence 
testing  were awed, fairly early on, when WOMEN  were  
*not*  found to be vastly inferior to men.     

But the same testers were happy and pleased to show 
the  *apparent*  inferiority of everyone who was not White 
and northern European.
 - explicable, with more care, as mass-testing of immigrants
who didn't do well on tests because they did not speak 
English.  Or maybe did not read, in any language.

Anyway -- I don't think I want to place all  
assumptions on the same level.  
Regression is easy.  "Causation"  is tough.

What would be great, I think, is if someone devised a
way to quantify how badly an assumption has fared.
"Normality tests"  don't tell us how much the non-normality has hurt.
 - the "test on variances" for the t-test is too powerful to be
useful when the samples are large, and too weak when the
samples are small.  What would serve, instead, as an index of
"How much does it matter?"

The Satterthwaite-Welch  "adjusted d.f."  gives a certain
index for the t-test.  Should we look at it more, or more often?
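The adjustment itself is simple to compute, and comparing the
Welch d.f. to the pooled d.f. is one rough index of "how much
it matters" (a sketch; scipy's ttest_ind with equal_var=False
applies the same adjustment):

    import numpy as np
    from scipy import stats

    def welch_df(a, b):
        # Satterthwaite-Welch approximate degrees of freedom.
        va, vb = np.var(a, ddof=1) / len(a), np.var(b, ddof=1) / len(b)
        return (va + vb) ** 2 / (va ** 2 / (len(a) - 1)
                                 + vb ** 2 / (len(b) - 1))

    rng = np.random.default_rng(3)
    a = rng.normal(0, 1, size=50)
    b = rng.normal(0, 3, size=10)   # unequal variances AND unequal n

    # Pooled d.f. would be 58; the Welch d.f. comes out far
    # smaller, which indexes how badly the equal-variance
    # assumption fares here.
    print(welch_df(a, b), len(a) + len(b) - 2)
    print(stats.ttest_ind(a, b, equal_var=False))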



-- 
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html
"Taxes are the price we pay for civilization."  Justice Holmes.