Robert Lundqvist <[EMAIL PROTECTED]> wrote in 
news:[EMAIL PROTECTED]:

> I found in one of the textbooks we use that calculating correlation
> coefficients is not meaningful when you have categorical data. However,
> using dummy variables should be possible, shouldn't it? Either when you
> have one ordinary numerical variable and one dummy, or even when you have
> two dummy variables. If not, could someone please point me in the right
> direction so I can stop being so hesitant in class... Comments are welcome,
> even if it turns out that I should have understood this.

It's certainly possible and the coefficients have meaning, but they can be 
hard to interpret.  Some simple algebra shows that the correlation between 
a dichotomous variable X and a continuous variable Y works out to 

R=sqrt(p*q)*(M1-M0)/S

where p is the proportion of X's that are 1, q = 1-p is the proportion of
X's that are zero, M1 is the mean of the Y's corresponding to X=1, M0 is the
mean of the Y's corresponding to X=0, and S is the standard deviation of Y
(taken over all the Y's with divisor n rather than n-1, so that the identity
is exact).
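
As a quick numerical check of the formula above, here is a minimal Python
sketch (standard library only). The data values and the helper names
pearson_r and point_biserial are made up for illustration; S is computed
with divisor n, which is what the algebra assumes.

```python
import math

def pearson_r(xs, ys):
    """Ordinary Pearson correlation computed from sums of squares."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    syy = sum((b - my) ** 2 for b in ys)
    return sxy / math.sqrt(sxx * syy)

def point_biserial(xs, ys):
    """R = sqrt(p*q) * (M1 - M0) / S for a 0/1 variable xs."""
    n = len(xs)
    y1 = [b for a, b in zip(xs, ys) if a == 1]
    y0 = [b for a, b in zip(xs, ys) if a == 0]
    p = len(y1) / n
    q = len(y0) / n
    m1 = sum(y1) / len(y1)
    m0 = sum(y0) / len(y0)
    my = sum(ys) / n
    # Standard deviation of all the Y's, divisor n (not n-1).
    s = math.sqrt(sum((b - my) ** 2 for b in ys) / n)
    return math.sqrt(p * q) * (m1 - m0) / s

# Hypothetical data: six observations with X=0, four with X=1.
x = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
y = [2.1, 3.4, 1.9, 2.8, 3.0, 2.5, 4.2, 3.9, 5.1, 4.4]

r_direct = pearson_r(x, y)
r_formula = point_biserial(x, y)
print(r_direct, r_formula)  # the two values agree up to rounding error
```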

So it's closely related to a commonly-used measure of mean difference
(Cohen's D): both standardize M1-M0, although D divides by the pooled
within-group standard deviation rather than the overall S, and R carries an
extra factor that depends on the X margins.  Thus talking about the
correlation between an indicator and a continuous variable is really talking
about the difference in means of two groups.  Because of the sqrt(p*q)
factor, the correlation coefficient may not be able to reach +/-1 for some
X margins, i.e. proportions of the indicator.
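
To illustrate the ceiling on R, here is a small Python sketch under stated
assumptions: the Y values are fixed at the made-up values 1, 2, ..., 10, and
for each split the k largest Y's get X=1, which is the assignment that
maximizes M1-M0 for that margin. The helper name point_biserial is
hypothetical.

```python
import math

def point_biserial(xs, ys):
    """R = sqrt(p*q) * (M1 - M0) / S, with S using divisor n."""
    n = len(xs)
    y1 = [b for a, b in zip(xs, ys) if a == 1]
    y0 = [b for a, b in zip(xs, ys) if a == 0]
    p = len(y1) / n
    q = len(y0) / n
    m1 = sum(y1) / len(y1)
    m0 = sum(y0) / len(y0)
    my = sum(ys) / n
    s = math.sqrt(sum((b - my) ** 2 for b in ys) / n)
    return math.sqrt(p * q) * (m1 - m0) / s

# Hypothetical Y values, already sorted in ascending order.
y = [float(v) for v in range(1, 11)]
n = len(y)

# For each k, give X=1 to the k largest Y's: the best case for that margin.
ceilings = {}
for k in range(1, n):
    x = [0] * (n - k) + [1] * k
    ceilings[k] = point_biserial(x, y)

for k, r in ceilings.items():
    print(f"p = {k / n:.1f}   largest attainable R = {r:.3f}")
```

For these particular Y values the largest attainable R stays below 1 for
every split, and it shrinks as the margins move away from 50/50.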

*Testing* such a correlation is exactly equivalent to testing for a mean
difference between two groups with the ordinary (equal-variance) t-test and
will give the same results, but it would be rather strange to report the
results as a test of correlation rather than as a t-test for mean
difference.  And it would make more sense to report D as the effect-size
measure rather than R, since its interpretation doesn't depend on the X
margins.
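
The equivalence of the two tests can be checked numerically. The sketch
below (standard library only, made-up data) computes the pooled-variance
two-sample t statistic and the usual t statistic for a correlation,
t = R*sqrt(n-2)/sqrt(1-R^2), both on n-2 degrees of freedom. D is taken here
as (M1-M0) divided by the pooled standard deviation, which is one common
definition and an assumption of this example.

```python
import math

# Hypothetical data: six observations with X=0, four with X=1.
x = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
y = [2.1, 3.4, 1.9, 2.8, 3.0, 2.5, 4.2, 3.9, 5.1, 4.4]
n = len(x)

# Two-sample t-test with pooled (equal) variance.
y1 = [b for a, b in zip(x, y) if a == 1]
y0 = [b for a, b in zip(x, y) if a == 0]
n1, n0 = len(y1), len(y0)
m1 = sum(y1) / n1
m0 = sum(y0) / n0
ss1 = sum((b - m1) ** 2 for b in y1)
ss0 = sum((b - m0) ** 2 for b in y0)
sp = math.sqrt((ss1 + ss0) / (n - 2))  # pooled standard deviation
t_pooled = (m1 - m0) / (sp * math.sqrt(1 / n1 + 1 / n0))

# Pearson correlation between the indicator and Y, and its t statistic.
mx = sum(x) / n
my = sum(y) / n
sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
sxx = sum((a - mx) ** 2 for a in x)
syy = sum((b - my) ** 2 for b in y)
r = sxy / math.sqrt(sxx * syy)
t_from_r = r * math.sqrt(n - 2) / math.sqrt(1 - r * r)

# Effect size: mean difference in units of the pooled standard deviation.
d = (m1 - m0) / sp

print(t_pooled, t_from_r)  # the two t statistics agree up to rounding error
print(r, d)
```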

