On 6 Oct 2003 08:06:43 -0700, [EMAIL PROTECTED] (Lauren) wrote:

> Hello,
> I am interested in calculating the correlation coefficient between two
> variables, both continuous quantitative variables between 0 and 1.
> However, I noticed that the dependent variable is nearly binary with a
> large amount of its values at 0 and 1 with only a small fraction
> in-between. What effect will this have on the correlation
> coefficient? Is it valid to use the correlation coefficient?
> Any help would be appreciated, thanks!
You can't have a perfect correlation between two variables unless
their distributions have the same shape, skew included. (Spread by
itself does not matter, since r is unchanged by linear rescaling.)
The maximum r is noticeably reduced when you have two dichotomies
that are skewed (effectively) in opposite directions.

The r is still valid for testing, but skews that go 'opposite' do
limit the possible 'information' that can be contained, or the
p-value that is deserved. Unless there are just a couple of cases
at the extreme, the r does give you a proper test, to a pretty good
approximation.

Is the r 'valid' as representing the relationship? Well, it will
not denote exactly the same thing as a correlation between two
Gaussian variables: How could it?

If the distribution has been artificially made into the (virtual)
dichotomy, then the r has probably been 'attenuated' by the
information loss. There are formulas for de-attenuating, if you are
strong enough at arguing statistics to justify doing so.
 - You could look for the tetrachoric (or other polychoric)
   correlation.
 - If the world is ever going to start using these, there should be
   more working examples than I have seen.
 - Folks doing structural equation models may have found an
   application.

-- 
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html
"Taxes are the price we pay for civilization."
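
To put rough numbers on two points from the reply (the ceiling on r
when two dichotomies are skewed in opposite directions, and the
attenuation caused by dichotomizing), here is a minimal Python sketch.
The function names are made up for illustration. The second and third
functions assume an underlying bivariate normal pair with both
variables cut exactly at their medians; that is only a special case of
the tetrachoric idea, not a general de-attenuation routine.

```python
import math

def max_phi(p, q):
    """Largest attainable r (phi) between two 0/1 variables with
    P(X=1) = p and P(Y=1) = q, for 0 < p, q < 1."""
    p11 = min(p, q)  # the margins allow no larger joint P(X=1, Y=1)
    return (p11 - p * q) / math.sqrt(p * (1 - p) * q * (1 - q))

def phi_after_median_split(rho):
    """r between the two 0/1 variables obtained by cutting a
    bivariate normal pair (correlation rho) at both medians.
    ASSUMPTION: bivariate normality and median cuts."""
    return 2.0 / math.pi * math.asin(rho)

def rho_from_median_split_phi(phi):
    """Inverse of the above: recover the underlying rho from the
    observed phi, under the same assumptions."""
    return math.sin(math.pi * phi / 2.0)

# Same skew leaves the ceiling at 1; opposite skews pull it down.
for p, q in [(0.5, 0.5), (0.2, 0.2), (0.2, 0.8), (0.1, 0.9)]:
    print(f"p={p}, q={q}: max r = {max_phi(p, q):.3f}")

# Attenuation from dichotomizing, and the matching correction.
for rho in (0.3, 0.5, 0.8):
    phi = phi_after_median_split(rho)
    back = rho_from_median_split_phi(phi)
    print(f"rho={rho}: phi = {phi:.3f}, recovered rho = {back:.3f}")
```

With margins of 0.2 and 0.8 the largest possible r is only 0.25, and an
underlying correlation of 0.5 shrinks to 1/3 after a median split on
both variables. Real data that are only 'nearly' binary, or cut away
from the median, will not follow these formulas exactly.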
