Hi Brett, Herman et al --
Occasionally it seems appropriate to send some results that help
reinforce the idea that the correlation coefficient can be of limited value
in some situations.
The table shown below illustrates what happens when the range of
X is restricted.
-- Joe
************************************************************************
* Joe Ward Health Careers High School *
* 167 East Arrowhead Dr 4646 Hamilton Wolfe *
* San Antonio, TX 78228-2402 San Antonio, TX 78229 *
* Phone: 210-433-6575 Phone: 210-617-5400 *
* Fax: 210-433-2828 Fax: 210-617-5423 *
* [EMAIL PROTECTED] *
* http://www.ijoa.org/joeward/wardindex.html *
************************************************************************
----- Original Message -----
From: Magill, Brett <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>; <[EMAIL PROTECTED]>
Sent: Friday, May 19, 2000 12:46 PM
Subject: Regression and Correlation (Was Correlation)
| I am no statistician, so let me make sure I am understanding what you are
| saying. Your point is that you may have an identical regression equation
| despite the fact that the correlation may vary depending on the amount of
| variation in X. If this is your point, I agree and recognize this--r is a
| measure of the fit about the regression line.
|
| Nonetheless, regression and correlation are the same in the bivariate case
| with the exception of scale. In a bivariate regression, the standardized
| Beta coefficient is equal to the Pearson r. As with any standardization, it
| removes the scale of the variation and the result is that the slope
| describes the relationship or B = r.
|
| Brett
|
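Brett's point that the standardized slope equals the Pearson r in the bivariate case can be checked numerically. The following is a minimal sketch, assuming NumPy is available. The simulated data, the seed, and the helper names `ols_slope` and `standardize` are illustrative assumptions and do not come from the thread.

```python
import numpy as np

def ols_slope(x, y):
    """Least-squares slope of y on x: cov(x, y) / var(x)."""
    return np.cov(x, y, ddof=1)[0, 1] / np.var(x, ddof=1)

def standardize(v):
    """Rescale to mean 0 and (sample) standard deviation 1."""
    return (v - v.mean()) / v.std(ddof=1)

# Illustrative data only: a linear relation plus noise.
rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, 1000)
y = 0.75 * x + rng.normal(0.0, 1.0, 1000)

r = np.corrcoef(x, y)[0, 1]
b = ols_slope(x, y)                                # slope in raw units
beta = ols_slope(standardize(x), standardize(y))   # slope in standardized units

print(f"r = {r:.4f}, raw slope b = {b:.4f}, standardized slope = {beta:.4f}")
```

The raw slope and r are linked by b = r * (sd of y / sd of x), so b stays put only if r changes whenever the spread of x changes, which is the distinction Herman draws below.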
BEGIN HERMAN'S MESSAGE
| -----Original Message-----
| From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]
| Sent: Friday, May 19, 2000 11:43 AM
| To: [EMAIL PROTECTED]
| Subject: Re: Correlation
|
|
| Magill, Brett <[EMAIL PROTECTED]> wrote:
| >Mike,
|
| >In the bivariate case, regression and correlation are identical.
|
| This is false. Correlation is the measure of the
| proportion of the variance of one variable explained by a
| linear function of the other in a joint distribution, while
| linear regression is the linear relation itself. One can
| have non-linear versions as well.
|
| If in fact E(Y|X) = aX + b, this will also be the case no
| matter how selection is made on X, whereas the correlation
| can vary greatly.
|
-----------------------------------------------------------------------
END OF HERMAN'S MESSAGE
-------------------------------------------------------------------------
Beginning of insert by Joe Ward
-------------------------------------------------------------------------
---------- Forwarded message ----------
Date: Fri, 23 May 1997 09:30:20 -0400 (EDT)
From: Mike Palij <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED], [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Subject: Re: Testing basic statistical concepts
I'd like to thank Joe Ward for reminding us of this situation
(his posting is appended below), as well as jogging my own
memory for a previous posting I had made. A while back I
had posted the Anscombe dataset (in the context of an SPSS
program) which also clearly shows the benefit of plotting
the data: the four situations produce almost identical
Pearson r values, but only one actually shows the classic
scatterplot; the others show a nonlinear pattern and the
influence that a single point has on the calculation of r.
What does the value of r tell us here? Aren't the basic
statistical concepts to be learned in this situation far
more important and most clearly seen through a coordination
of the graphical and numerical information?
-Mike Palij/Psychology Dept/New York University
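For readers without the earlier SPSS posting, the Anscombe comparison Mike describes can be reproduced roughly as follows. This is a sketch, assuming NumPy. The values are transcribed from Anscombe's 1973 quartet and are worth checking against a trusted copy; the names `anscombe` and `summarize` are hypothetical.

```python
import numpy as np

# Anscombe's quartet; sets I-III share the same x values.
x123 = np.array([10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5], dtype=float)
x4 = np.array([8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8], dtype=float)

anscombe = {
    "I": (x123, np.array([8.04, 6.95, 7.58, 8.81, 8.33, 9.96,
                          7.24, 4.26, 10.84, 4.82, 5.68])),
    "II": (x123, np.array([9.14, 8.14, 8.74, 8.77, 9.26, 8.10,
                           6.13, 3.10, 9.13, 7.26, 4.74])),
    "III": (x123, np.array([7.46, 6.77, 12.74, 7.11, 7.81, 8.84,
                            6.08, 5.39, 8.15, 6.42, 5.73])),
    "IV": (x4, np.array([6.58, 5.76, 7.71, 8.84, 8.47, 7.04,
                         5.25, 12.50, 5.56, 7.91, 6.89])),
}

def summarize(x, y):
    """Return Pearson r plus the least-squares slope and intercept."""
    r = np.corrcoef(x, y)[0, 1]
    b, a = np.polyfit(x, y, 1)   # degree-1 fit: y = b*x + a
    return r, b, a

for name, (x, y) in anscombe.items():
    r, b, a = summarize(x, y)
    print(f"Set {name:>3}: r = {r:.3f}, b = {b:.3f}, a = {a:.3f}")
```

The numerical summaries come out nearly the same for all four sets, so only a plot reveals how different the four situations are.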
Joe H Ward <[EMAIL PROTECTED]> writes:
To Mike et al --
There have been several messages related to the Simple Correlation
Coefficient. IMHO, out in the "real world" of practical
decision-making, the correlation coefficient has very limited value and
sometimes dangerous consequences. The correlation coefficient may be
an important topic in the history of statistics, as a way to learn the
problems associated with its use.
Attached below is an item that I submitted a long time ago, and it may be
of interest to those following the discussion of "r".
-- Joe
NON-RANDOM SAMPLING AND REGRESSION
-- PROVIDED (MANY YEARS AGO) BY
JACK SCHMID, UNIV. OF NORTHERN COLORADO, GREELEY, COLORADO
y from (MU=0, SIGMA = 1.25)
x from (MU=0, SIGMA = 1.00)
RHOxy = .60
Sample 10,000 cases at each level of progressive TRUNCATION ON x.
Regression equation: y = bx + a
%Remaining   mean y   mean x  sigma_y  sigma_x   r=BETA        b        a      Syx
__________________________________________________________________________________
      100%      .01      .02     1.25     1.00      .60      .75     -.01     1.00
       90%     -.15     -.19     1.18      .85      .53      .74     -.01     1.00
       80%     -.27     -.35     1.15      .76      .49      .74     -.01     1.00
       70%     -.38     -.50     1.13      .70      .45      .73     -.02     1.01
       60%     -.49     -.65     1.11      .64      .42      .72     -.02     1.01
       50%     -.59     -.80     1.10      .59      .40      .74      .00     1.01
       40%     -.71     -.96     1.09      .55      .38      .76      .03     1.01
       30%     -.84    -1.15     1.08      .51      .36      .77      .04     1.01
       20%    -1.03    -1.39     1.06      .46      .33      .77      .03     1.00
       10%    -1.32    -1.75     1.04      .39      .28      .75     -.01     1.00
******* Students (or anyone who uses CORRELATION COEFFICIENTS) can observe
that a correlation value can be made to have almost any value by
"carefully" selecting the data or using data that has been truncated!
However, the regression coefficient, b (slope of the line) and Syx are
more stable under various restrictions on the data.
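The pattern in the table above can be re-created with a short simulation. This is a rough sketch, assuming NumPy and a bivariate normal population with the stated parameters. It also assumes that truncation keeps the lowest x values, which is inferred from the negative means in the table and is not stated in the original. The function name `truncated_fit` and the seed are hypothetical, so the figures will not match Jack Schmid's run exactly.

```python
import numpy as np

def truncated_fit(keep, n=10_000, rho=0.60, sd_x=1.00, sd_y=1.25, seed=0):
    """Fit y = b*x + a on the n cases with the lowest x values.

    `keep` is the fraction of the population remaining after truncation.
    Returns (r, b, a, s_yx).
    """
    rng = np.random.default_rng(seed)
    n_draw = int(round(n / keep))          # draw enough so that n cases remain
    x = rng.normal(0.0, sd_x, n_draw)
    # Given x, y is normal with slope rho*sd_y/sd_x and
    # residual SD sd_y*sqrt(1 - rho^2).
    slope = rho * sd_y / sd_x
    resid_sd = sd_y * np.sqrt(1.0 - rho**2)
    y = slope * x + rng.normal(0.0, resid_sd, n_draw)

    lowest = np.argsort(x)[:n]             # assumed direction of truncation
    xs, ys = x[lowest], y[lowest]

    b, a = np.polyfit(xs, ys, 1)
    r = np.corrcoef(xs, ys)[0, 1]
    resid = ys - (b * xs + a)
    s_yx = np.sqrt(np.sum(resid**2) / (n - 2))   # standard error of estimate
    return r, b, a, s_yx

print("%Remaining      r      b      a    Syx")
for pct in range(100, 0, -10):
    r, b, a, s_yx = truncated_fit(pct / 100)
    print(f"{pct:>9}% {r:6.2f} {b:6.2f} {a:6.2f} {s_yx:6.2f}")
```

With these population values the expected slope is .60 * 1.25 / 1.00 = .75 and the expected Syx is 1.25 * sqrt(1 - .36) = 1.00, which agrees with the b and Syx columns of the table at every level of truncation.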
-------------------------------