Hi, Znarf --

Every so often I find an occasion to include an earlier message from
Mike Palij (SEE THE END OF THIS MESSAGE) related to the results of a
study by Jack Schmid about the RESTRICTION OF RANGE EFFECT ON THE
CORRELATION COEFFICIENT.

After many years of being around folks who were concerned about
RESTRICTION OF RANGE it became obvious to me that the correlation
coefficient should be used with EXTREME CAUTION.

-- Joe
***************************************************************************
Joe Ward.........................................Health Careers High School
167 East Arrowhead Dr....................4646 Hamilton Wolfe
San Antonio, TX 78228-2402...........San Antonio, TX 78229
Phone: 210-433-6575.......................Phone:  210-617-5400
Fax: 210-433-2828............................Fax: 210-617-5423
Email: [EMAIL PROTECTED]
http://www.ijoa.org/joeward/watdindex.html
***************************************************************************


----- Original Message -----
From: "Znarf Akfak" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Monday, July 10, 2000 2:41 AM
Subject: bivariate normality and correlation


> I'm considering reporting Pearson's correlation coefficient with a
> confidence interval for several bivariate associations.  As bivariate
> normality is assumed under the computation of the confidence interval,
> I have two questions.
>
> 1.  What is a good way to examine the assumption of bivariate normality
> for a given data set?
>
> 2.  To what extent are such confidence intervals robust to departures
> from bivariate normality?
>
> References to publications would be much appreciated as I don't have
> access to CIS, as would other suggestions and comments.
>
> Cheers,
>
> --
> Znarf
>
>
>
>
> =================================================================
> Instructions for joining and leaving this list and remarks about
> the problem of INAPPROPRIATE MESSAGES are available at
>                   http://jse.stat.ncsu.edu/
> =================================================================


======  INSERT BY JOE WARD OF MESSAGE FROM MIKE PALIJ  ======


---------- Forwarded message ----------
Date: Fri, 23 May 1997 09:30:20 -0400 (EDT)
From: Mike Palij <[EMAIL PROTECTED]>
To: [EMAIL PROTECTED], [EMAIL PROTECTED]
Cc: [EMAIL PROTECTED]
Subject: Re: Testing basic statistical concepts

I'd like to thank Joe Ward for reminding us of this situation
(his posting is appended below), as well as jogging my own
memory for a previous posting I had made.  A while back I
had posted the Anscombe dataset (in the context of an SPSS
program) which also clearly shows the benefit of plotting
the data:  the four situations produce almost identical
Pearson r values, but only one actually shows the classic
scatterplot; the others show a nonlinear pattern and the
influence that a single point has on the calculation of r.
What does the value of r tell us here?  Aren't the basic
statistical concepts to be learned in this situation far
more important and most clearly seen through a coordination
of the graphical and numerical information?
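
A minimal sketch of the point above, assuming the standard published
values of Anscombe's quartet (typed in here for illustration; they are
not copied from the SPSS posting mentioned above). The helper name
pearson_r is hypothetical. It computes r for each of the four sets so
the near-identical values can be compared against the four scatterplots.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation from sums of squares and cross-products."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Assumed data: the commonly published Anscombe quartet values.
x123 = [10, 8, 13, 9, 11, 14, 6, 4, 12, 7, 5]  # shared by sets I-III
y1 = [8.04, 6.95, 7.58, 8.81, 8.33, 9.96, 7.24, 4.26, 10.84, 4.82, 5.68]
y2 = [9.14, 8.14, 8.74, 8.77, 9.26, 8.10, 6.13, 3.10, 9.13, 7.26, 4.74]
y3 = [7.46, 6.77, 12.74, 7.11, 7.81, 8.84, 6.08, 5.39, 8.15, 6.42, 5.73]
x4 = [8, 8, 8, 8, 8, 8, 8, 19, 8, 8, 8]
y4 = [6.58, 5.76, 7.71, 8.84, 8.47, 7.04, 5.25, 12.50, 5.56, 7.91, 6.89]

quartet = {"I": (x123, y1), "II": (x123, y2), "III": (x123, y3), "IV": (x4, y4)}

r_values = {name: pearson_r(xs, ys) for name, (xs, ys) in quartet.items()}
for name, r in r_values.items():
    print(f"set {name}: r = {r:.3f}")
```

The numbers alone cannot separate the four sets; plotting each (xs, ys)
pair is what shows the curve in set II and the single influential point
in sets III and IV.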

-Mike Palij/Psychology Dept/New York University

Joe H Ward <[EMAIL PROTECTED]> writes:
 To Mike et al --

 There have been several messages related to the Simple Correlation
 Coefficient.  IMHO, out in the "real world" of practical
 decision-making, the correlation coefficient has very limited value and
 sometimes dangerous consequences.  The correlation coefficient may be
 an important topic in the history of statistics, as a way to learn the
 problems associated with its use.

 Attached below is an item that I submitted a long time ago, and it may be
 of interest to those following the discussion of "r".

 -- Joe
 ***********************************************************************
 * Joe Ward                                Health Careers High School  *
 * 167 East Arrowhead Dr.                4646 Hamilton Wolfe       *
 * San Antonio, TX 78228-2402        San Antonio, TX 78229       *
 * Phone: 210-433-6575                 Phone: 210-617-5400         *
 * [EMAIL PROTECTED]
 ***********************************************************************

 NON-RANDOM SAMPLING AND REGRESSION

    -- PROVIDED (MANY YEARS AGO) BY
  JACK SCHMID, UNIV. OF NORTHERN COLORADO, GREELEY, COLORADO

 y from (MU=0, SIGMA = 1.25)
 x from (MU=0, SIGMA = 1.00)
 RHOxy = .60

 Sample 10,000 cases at each level of progressive TRUNCATION ON x.

 Regression equation:  y = bx + a

 %Remaining  mean y  mean x sigma y sigma x  r=BETA       b       a     Syx
 __________________________________________________________________________
       100%     .01     .02    1.25    1.00     .60     .75    -.01    1.00
        90%    -.15    -.19    1.18     .85     .53     .74    -.01    1.00
        80%    -.27    -.35    1.15     .76     .49     .74    -.01    1.00
        70%    -.38    -.50    1.13     .70     .45     .73    -.02    1.01
        60%    -.49    -.65    1.11     .64     .42     .72    -.02    1.01
        50%    -.59    -.80    1.10     .59     .40     .74     .00    1.01
        40%    -.71    -.96    1.09     .55     .38     .76     .03    1.01
        30%    -.84   -1.15    1.08     .51     .36     .77     .04    1.01
        20%   -1.03   -1.39    1.06     .46     .33     .77     .03    1.00
        10%   -1.32   -1.75    1.04     .39     .28     .75    -.01    1.00
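
 The shrinkage of r in the table is what the standard formula for direct
 selection on x predicts, assuming the regression of y on x is linear
 with constant residual variance (an assumption consistent with the
 setup above, not something stated in the original item). The symbol u,
 the ratio of the restricted to the unrestricted sigma x, is introduced
 here for illustration:

```latex
r_{\text{restricted}} = \frac{\rho\,u}{\sqrt{1 - \rho^{2} + \rho^{2}u^{2}}},
\qquad
u = \frac{\sigma_{x,\text{restricted}}}{\sigma_{x}}
```

 For the 50% row, u = .59 and rho = .60 give .354 / sqrt(.765), or
 about .40, which matches the tabled r. Under the same assumptions the
 slope b and Syx do not depend on u.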

 ******* Students (or anyone who uses CORRELATION COEFFICIENTS) can observe
 that a correlation value can be made to have almost any value by
 "carefully" selecting the data or using data that has been truncated!
 However, the regression coefficient, b (slope of the line) and Syx are
 more stable under various restrictions on the data.
 -------------------------------
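
 A rough sketch of how a table like the one above could be reproduced.
 The details of Jack Schmid's original procedure are not given, so
 several things here are assumptions: truncation keeps the lowest
 fraction of x (which is what the negative tabled means suggest), one
 pool of 100,000 pairs is drawn and nested subsets are taken from it
 (so the 10% row has 10,000 cases), and the names summarize and
 simulate are hypothetical.

```python
import math
import random


def summarize(pairs):
    """Return (r, b, a, Syx) for the least-squares line y = b*x + a."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    syy = sum((y - mean_y) ** 2 for _, y in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    r = sxy / math.sqrt(sxx * syy)
    b = sxy / sxx
    a = mean_y - b * mean_x
    syx = math.sqrt((syy - b * sxy) / (n - 2))  # standard error of estimate
    return r, b, a, syx


def simulate(n=100_000, rho=0.60, sigma_x=1.00, sigma_y=1.25, seed=1):
    """Draw bivariate normal pairs, then truncate progressively on x."""
    rng = random.Random(seed)
    slope = rho * sigma_y / sigma_x                # .75 for these values
    resid_sd = sigma_y * math.sqrt(1 - rho ** 2)   # 1.00 for these values
    pairs = []
    for _ in range(n):
        x = rng.gauss(0, sigma_x)
        y = slope * x + rng.gauss(0, resid_sd)
        pairs.append((x, y))
    pairs.sort()  # ascending in x, so a prefix is the lowest fraction of x
    results = {}
    for pct in range(100, 0, -10):
        kept = pairs[: n * pct // 100]  # assumed rule: keep the lowest pct% of x
        results[pct] = summarize(kept)
    return results


results = simulate()
for pct, (r, b, a, syx) in results.items():
    print(f"{pct:3d}%  r={r:.2f}  b={b:.2f}  a={a:+.2f}  Syx={syx:.2f}")
```

 With a different seed the individual figures move slightly, but r
 should fall steadily as more of x is cut off while b and Syx stay
 close to .75 and 1.00.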





