This practical question arose between a colleague and me at work. It
concerns whether we can use correlation analysis if one of the variables
is non-continuous or "categorical." She believes that both variables
must be continuous. However, she cannot say why, and I cannot find any
such constraint in the statistics book I have relied on since graduating
in Industrial Engineering a few years ago: Miller and Freund,
'Probability and Statistics for Engineers.'

I have been thinking that if x is discrete and can assume only a few
values, while y is continuous, the correlation study may have a high
probability of Type II error. By this I mean that it may provide
insufficient evidence with which to reject the null hypothesis even
when a relationship exists. But I have not thought of this as an
inappropriate use of correlation.

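To make the case concrete, here is a minimal sketch of what I have in mind: the ordinary sample correlation coefficient computed when x takes only three values and y is continuous. The data are simulated, and the function names `pearson_r` and `simulate`, the three levels, the slope, and the noise level are all my own assumptions for illustration; none of them come from Miller and Freund.

```python
import math
import random


def pearson_r(xs, ys):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate(n=200, slope=0.5, levels=(1, 2, 3), seed=0):
    """Hypothetical data: x is drawn from a few discrete levels,
    y is continuous (a linear trend in x plus normal noise)."""
    rng = random.Random(seed)
    xs = [rng.choice(levels) for _ in range(n)]
    ys = [slope * x + rng.gauss(0.0, 1.0) for x in xs]
    return xs, ys


xs, ys = simulate()
r = pearson_r(xs, ys)
print(f"r = {r:.3f} with x limited to {sorted(set(xs))}")
```

The arithmetic goes through without complaint for discrete x, so the formula itself does not forbid it. What I cannot tell is whether the usual significance test on r remains trustworthy in this situation.
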
On the other hand, in attempting to probe Miller and Freund, I find that
correlation is based on the "bivariate normal distribution," the
formula for which has numerous parameters, including alpha and beta, the
least squares regression coefficients. My understanding is that
obtaining the latter requires that the function be differentiable, and
hence that x must also be continuous. This seems to support my friend's
view.

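For reference, this is the least-squares criterion as I understand it, written in my own notation, which may not match the book's: the observations are the pairs $(x_i, y_i)$, $i = 1, \dots, n$, and the fitted line is $y = \alpha + \beta x$.

```latex
S(\alpha, \beta) = \sum_{i=1}^{n} \left( y_i - \alpha - \beta x_i \right)^2

\frac{\partial S}{\partial \alpha}
  = -2 \sum_{i=1}^{n} \left( y_i - \alpha - \beta x_i \right) = 0

\frac{\partial S}{\partial \beta}
  = -2 \sum_{i=1}^{n} x_i \left( y_i - \alpha - \beta x_i \right) = 0
```

Written this way, the derivatives are taken with respect to alpha and beta, with the observed $x_i$ entering only as fixed numbers. Whether that still places a continuity requirement on x itself is the point I am unsure of.
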
I would appreciate clarification of any such constraints on the
practical use of correlation analysis. Also, if anyone can recommend a
textbook that addresses questions such as this more directly than Miller
and Freund, I would appreciate that also.