On 11 Jun 2002 05:51:54 -0700, [EMAIL PROTECTED] (Katja
Wunderlich) wrote:

> Hi!
> 
> Is it OK to do a regression analysis, when the sizes of the sub
> samples are very unequal?
> 
> This is our sample with the sub samples:
[ snip.   6 groups with Ns from 101 to 500]

The problem is:  What sort of statements can you make?
What sort do you want to make?  What is the alternative?
[By the way, a 5-fold difference is not a huge one.]

And what is the role of your sub-samples?  That is,
you might be directly comparing them; or you might
be expecting consistency of some other prediction.
You might be willing to contrast the largest group to
all the others.  

If you are mainly interested in saying, 
"Here is a striking effect"  or  "We don't see anything" --
the variation in Ns (it seems to me) matters very little.

Unequal Ns potentially open up problems of interpretation.
For instance, the p-levels of differences (say, for dummy
coefficients for Nations) might not 'show the same thing' as
looking at the raw magnitudes of differences, because the same
raw difference has a smaller standard error when the groups
involved are larger. But that does not mean we throw out some
data to avoid the problem. We can do the regression. One way
to think of it, I believe, is that drawing conclusions with
unequal Ns (sometimes) becomes similar to comparing two studies.
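
To make the point about p-levels versus raw magnitudes concrete,
here is a rough sketch in Python. Everything in it is assumed for
illustration: a common SD of 1.0, a raw difference of 0.2 from a
reference group of 500, and the comparison-group sizes in between
(only the range 101 to 500 comes from the original post). It is a
simple two-group z approximation, not the full regression.

```python
import math

def diff_z(delta, sd, n_ref, n_group):
    """Approximate z statistic for a mean difference `delta` between a
    reference group and one other group, assuming a common SD."""
    se = sd * math.sqrt(1.0 / n_ref + 1.0 / n_group)
    return delta / se

# Hypothetical values; only the range of Ns (101 to 500) is from the post.
sd = 1.0                          # assumed common standard deviation
delta = 0.2                       # same raw difference for every comparison
n_ref = 500                       # largest group used as the reference
group_ns = [101, 200, 300, 400]   # made-up sizes for the other groups

z_by_n = {n: diff_z(delta, sd, n_ref, n) for n in group_ns}

for n, z in z_by_n.items():
    flag = "above 1.96" if abs(z) > 1.96 else "below 1.96"
    print(f"n={n:3d}  z={z:.2f}  ({flag})")
```

Under these assumptions the identical raw difference falls short of
the usual 1.96 cutoff for the group of 101 and clears it for the
larger groups, which is the sense in which the tests and the
magnitudes can seem to disagree.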


-- 
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html
