On 6 Oct 2003 08:06:43 -0700, [EMAIL PROTECTED] (Lauren) wrote:

> Hello,
> I am interested in calculating the correlation coefficient between two
> variables, both continuous quantitative variables between 0 and 1. 
> However, I noticed that the dependent variable is nearly binary with a
> large amount of its values at 0 and 1 with only a small fraction
> in-between.  What effect will this have on the correlation
> coefficient?  Is it valid to use the correlation coefficient?
> Any help would be appreciated, thanks!

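You can't have a perfect (positive) correlation between two
variables unless their distributions have the same shape
(the same skew, for instance).  Spread does not matter,
since  r  is unchanged by rescaling either variable.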

The maximum  r  is noticeably reduced when you have
two dichotomies that are skewed (effectively) in
opposite directions.  The  r  is still valid for testing, but
skews that go 'opposite' do limit the 'information'
that the data can contain, and hence how small a p-value
you can reach.  Unless there are only a couple of cases at
the extreme, the  r  does give you a proper test, to a
pretty good approximation.
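
To put numbers on that ceiling, here is a minimal Python sketch.
The function name max_phi is my own, not from any package.  It
assumes both variables are strictly 0/1, with P(X=1) = p and
P(Y=1) = q, and computes the largest Pearson  r  (phi) that any
2x2 table with those margins allows, by putting as many cases as
possible into the (1,1) cell.

```python
import math


def max_phi(p, q):
    """Largest Pearson r between two 0/1 variables with P(X=1)=p, P(Y=1)=q.

    Illustrative helper (hypothetical name). The joint proportion
    P(X=1, Y=1) can be at most min(p, q); r is largest at that bound.
    """
    if not (0.0 < p < 1.0 and 0.0 < q < 1.0):
        raise ValueError("p and q must lie strictly between 0 and 1")
    joint_11 = min(p, q)              # most overlap the margins permit
    cov = joint_11 - p * q            # covariance of the two indicators
    return cov / math.sqrt(p * (1 - p) * q * (1 - q))


# Same skew leaves the ceiling at 1; opposite skews pull it down.
for p, q in [(0.5, 0.5), (0.2, 0.2), (0.2, 0.8), (0.1, 0.9)]:
    print(f"p={p:.1f}  q={q:.1f}  max r = {max_phi(p, q):.3f}")
```

With a 20/80 split on one variable against 80/20 on the other,
the ceiling is 0.25; with 10/90 against 90/10 it is about 0.11.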


Is the  r  'valid'  as representing the relationship?
Well, it will not denote exactly the same thing as a
correlation between two Gaussian variables:
How could it?

If the distribution has been artificially made into the
(virtual) dichotomy, then the  r  has probably been
'attenuated' by the information loss.  There are formulas
for de-attenuating, if you are strong enough at arguing
statistics to justify doing so.
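
As a rough illustration of attenuation, the Python sketch below
simulates two Gaussian variables with a known correlation, cuts
one of them into 0/1 at a threshold, and compares the  r  before
and after.  It also applies one de-attenuation formula, the
biserial correction, which is only justified if you are willing
to assume a Gaussian variable underlies the dichotomy.  The
function names, sample size, seed and rho = 0.6 are all my own
choices for the example.

```python
import math
import random
from statistics import NormalDist


def pearson_r(xs, ys):
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate(rho, cut, n=20000, seed=1):
    """Return (r on continuous y, r on dichotomized y, biserial-corrected r)."""
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        x = rng.gauss(0, 1)
        # y is built to have population correlation rho with x
        y = rho * x + math.sqrt(1 - rho ** 2) * rng.gauss(0, 1)
        xs.append(x)
        ys.append(y)
    y_bin = [1.0 if y > cut else 0.0 for y in ys]
    r_full = pearson_r(xs, ys)
    r_bin = pearson_r(xs, y_bin)
    # Biserial correction: assumes a Gaussian variable was cut at z
    p = sum(y_bin) / n
    z = NormalDist().inv_cdf(1 - p)
    r_corrected = r_bin * math.sqrt(p * (1 - p)) / NormalDist().pdf(z)
    return r_full, r_bin, r_corrected


for cut in (0.0, 1.5):
    r_full, r_bin, r_corr = simulate(rho=0.6, cut=cut)
    print(f"cut={cut}: r={r_full:.3f}  dichotomized={r_bin:.3f}  "
          f"corrected={r_corr:.3f}")
```

Under these assumptions a cut at the median shrinks  r  by a
factor of about 0.80, and a cut at 1.5 standard deviations
shrinks it by roughly half.  The correction recovers something
close to the original  r  here only because the simulated data
really are Gaussian underneath.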

 - You could look for the tetrachoric (or, more generally,
polychoric) correlation.
 - If the world is ever going to start using these, there
should be more working examples than I have seen.
 - Folks doing structural equation models may have
found an application.


-- 
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html
"Taxes are the price we pay for civilization." 
