On 15 Jun 2003 00:04:35 -0700, [EMAIL PROTECTED] (Mohammad Ehsanul Karim) wrote:
> Rich Ulrich <[EMAIL PROTECTED]> wrote in message
> news:<[EMAIL PROTECTED]>...
> > You do need to meet assumptions in order to trust the
> > statistical tests. You need to meet some additional
> > assumptions in order to trust the implications of the
> > regression coefficients.
>
> Additional assumption .. such as ..?
> ...

I'll summarize that another way.

Absolutely no assumptions are needed in order to *perform* a
regression, so long as you don't run into illegal arithmetic --
divide by zero, etc. I'm saying, if you can compute it, then it is
legal to compute it. (See the first sketch in the P.S. below.)

However, there are different stages of generalizing. It is rational
to conclude that you can draw more conclusions from "better data" --
whatever "better" means:
 - large Ns with good randomness from the universe of interest;
   continuous scores; no outliers; meaningful, linear scaling; ...

"Broken assumptions" is one way to say, Here's why that one did not
come out right.

There are no assumptions needed to DO a regression. There are
certain numerical assumptions (or limits) behind creating a valid
statistical test -- a large enough N; independence of errors; a
range of scoring; ... not much else.

There are further assumptions or adjustments before accepting the
point estimates of effect sizes -- correction for attenuation, for
instance, if the predictors are "measured with error". But tests are
often unaffected by *that* sort of bias in estimates. (The second
sketch below shows the arithmetic.)

There are harder assumptions to meet before accepting the logic of
causation. The logic is open to argument about "outside effects" any
time that you don't start with a randomized trial.

Thus, we can compare Males to Females; if the difference is 5 points
[of something], that might be a good estimate or an underestimate,
depending on scoring reliability. On the other hand, it might be
unjustified to attribute the effect to "gender" if someone can come
up with an outside reason that is more basic [physical size, say] to
account for the measured difference.

The situation is somewhat asymmetrical when we compare M to F for a
huge sample. If *no* effect is 'significant', we conclude that there
"probably" is not any sizable difference. If *some* effect is
demonstrated, then the whole world is invited to make suggestions to
account for it.

Thus, the Anglo-American psychologists doing intelligence testing
were awed, fairly early on, when WOMEN were *not* found to be vastly
inferior to men. But the same testers were happy and pleased to show
the *apparent* inferiority of everyone who was not White and
northern European -- explicable, with more care, as mass-testing of
immigrants who did badly on the tests because they did not speak
English, or maybe did not read in any language.

Anyway -- I don't think I want to place all assumptions on the same
level. Regression is easy. "Causation" is tough.

What would be great, I think, is if someone devised a way to
quantify how badly an assumption has fared. "Normality tests" don't
tell us how much the non-normality has hurt.
 - the "test on variances" for the t-test is too powerful to be
   useful when the samples are large, and too weak when the samples
   are small.

What will index, instead, "How much does it matter?" The
Satterthwaite-Welch "adjusted d.f." gives a certain index for the
t-test (third sketch below). Should we look at it more, or more
often?

--
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html
"Taxes are the price we pay for civilization."  Justice Holmes.
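
P.S. A few numerical sketches of the points above, in Python. Every
number in them is invented for illustration.

First, the claim that nothing is assumed in order to *compute* a
regression: least squares is just arithmetic on whatever numbers you
feed it. A minimal sketch, assuming numpy is available:

  import numpy as np

  rng = np.random.default_rng(0)
  x = rng.uniform(0, 10, size=20)           # any predictor values at all
  y = 3.0 + 0.5 * x + rng.normal(size=20)   # any response values at all

  # Least squares solves the normal equations b = (X'X)^-1 X'y;
  # lstsq does it without an explicit inverse. No statistical
  # assumption is consulted anywhere -- only legal arithmetic.
  X = np.column_stack([np.ones_like(x), x])
  b, *_ = np.linalg.lstsq(X, y, rcond=None)
  print("intercept, slope:", b)   # computable; whether it MEANS
                                  # anything is a separate question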
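
Second, the correction for attenuation. The classical (Spearman)
formula divides the observed correlation by the square root of the
product of the two reliabilities; the figures below are
hypothetical, not from any real study:

  import math

  r_observed = 0.40   # correlation as measured (made up)
  rel_x = 0.70        # reliability of the predictor scores (made up)
  rel_y = 0.80        # reliability of the criterion scores (made up)

  # Disattenuated estimate: r_true = r_obs / sqrt(rel_x * rel_y)
  r_corrected = r_observed / math.sqrt(rel_x * rel_y)
  print("observed r = %.2f, corrected r = %.2f"
        % (r_observed, r_corrected))   # 0.40 -> about 0.53

Notice that only the *point estimate* moves; a significance test on
r_observed is left alone -- which is the asymmetry noted above.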
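
Third, the Satterthwaite-Welch adjusted d.f., read as an index of
"how much do the unequal variances matter?" Compare it against the
pooled d.f. of n1 + n2 - 2; the sample statistics are invented:

  def welch_df(s1, n1, s2, n2):
      """Satterthwaite approximation to the t-test's d.f."""
      v1, v2 = s1**2 / n1, s2**2 / n2
      return (v1 + v2)**2 / (v1**2 / (n1 - 1) + v2**2 / (n2 - 1))

  n1, s1 = 30, 1.0    # group 1: N and SD (made up)
  n2, s2 = 30, 3.0    # group 2: N and SD (made up)

  print("pooled d.f.  :", n1 + n2 - 2)                     # 58
  print("adjusted d.f.: %.1f" % welch_df(s1, n1, s2, n2))  # about 35.4

The farther the adjusted d.f. falls below the pooled d.f., the more
the heterogeneity of variance matters; close agreement says "not
much."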
