Robert Lundqvist wrote:
>
> I found in one of the textbooks we use that calculating correlation
> coefficients is not meaningful when you have categorical data. However,
> using dummy variables should be possible, shouldn't it? Either when you
> have one ordinary numerical variable and one dummy, or even when you have
> two dummy variables. If not, could someone please point me in the right
> direction so I can stop being so hesitant in class... Comments are welcome,
> even if it turns out that I should have understood this.
Oh, it's _possible_, all right. It's just not *meaningful*, because
there are many ways to assign dummy variables to the levels of the
categorical variable, and typically each will give a different result.
Is "banana" between "apple" and "orange" or not?
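
To make the point concrete, here is a minimal sketch in plain Python. The fruit weights are made-up illustrative numbers, and `pearson`, `coding_a`, and `coding_b` are hypothetical names introduced only for this example. It codes the same three-level variable in two different orders and computes the Pearson correlation with the same numeric variable each time.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Made-up data: a three-level categorical variable and a numeric one.
fruit = ["apple", "apple", "banana", "banana", "orange", "orange"]
weight = [150, 160, 120, 118, 200, 210]

# Two equally arbitrary numeric codings of the same categories.
coding_a = {"apple": 1, "banana": 2, "orange": 3}  # banana "between"
coding_b = {"banana": 1, "apple": 2, "orange": 3}  # apple "between"

r_a = pearson([coding_a[f] for f in fruit], weight)
r_b = pearson([coding_b[f] for f in fruit], weight)

print(f"coding A: r = {r_a:.3f}")
print(f"coding B: r = {r_b:.3f}")
```

The two printed coefficients differ even though the data are identical; only the arbitrary ordering of the labels changed.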
There is a sort of exception when there are two levels, in which case
all ways of labelling are equivalent up to linear transformation; but
there are better ways to deal with this special case.
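
The two-level exception can be checked the same way. This is a minimal sketch with made-up scores and hypothetical names (`pearson`, `group`, `score`): any two distinct numbers used as labels are a linear transformation of 0/1, so the correlation keeps its magnitude and at most changes sign.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)

# Made-up data: a two-level categorical variable and a numeric one.
group = ["ctrl", "ctrl", "ctrl", "treat", "treat", "treat"]
score = [4.0, 5.0, 6.0, 7.0, 9.0, 8.0]

# The usual 0/1 dummy coding.
r_01 = pearson([{"ctrl": 0, "treat": 1}[g] for g in group], score)

# An arbitrary relabelling that also reverses the order of the levels.
r_alt = pearson([{"ctrl": 10, "treat": -3}[g] for g in group], score)

print(f"0/1 coding:   r = {r_01:.3f}")
print(f"10/-3 coding: r = {r_alt:.3f}")
```

Here the two coefficients have the same absolute value and opposite signs, because the second labelling reverses which level gets the larger number.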
-Robert Dawson