On Fri, 30 May 2008 10:16:30 -0700, Ken Steele wrote:
>Mike Palij wrote:
>>On Thu, 29 May 2008 12:50:40 -0700, Ken Steele writes:
>>>A review article of these issues has appeared in timely fashion-
>>>Sackett, P. R., Borneman, M. J., & Connelly, B. J. (2008).
>>>High-stakes testing in higher education and employment:
>>>Appraising the evidence for validity and fairness. American
>>>Psychologist, 63 (4), 215-227.
[snip]
Ken wrote:
>I agree with Mike that this is an issue that is poorly understood. I
>just created a data set for a colleague to simulate this effect.
>The set was a simulation of the relationship between SAT and HS GPA.
>The original set began with N = 100 and r = 0.8. I started reducing the
>set size based on hypothetical college admission cutoffs. By
>the last slice (OldIvy standards, SAT > 1399, N = 6), r = 0.19.

First, I want to thank Ken for trying to make the situation clearer.
It also motivated me to reflect on what I had written (was it a psychotic
break with reality or not?) and to check on a couple of references.
As it turns out, the problem of range restriction and the situation
that I described is presented in the Glass & Hopkins (1996; 3rd
ed) statistics text on pages 121-123 (for the 2nd ed, see pp. 92-93).
Indeed, the 3rd edition goes through the calculations for an
example involving the SAT and GPA on pages 122-123.

Second, even so, I think there are still problems. As G&H state:
"the principal value of Equation 7.5 [the equation correcting for
restriction of range] is *not* for the purpose of estimating rho
[the population correlation in the unrestricted sample]."  The 3rd
ed is not clear on this point but the 2nd ed refers to a paper by
Gullickson and Hopkins (1976) for more information about the
accuracy of the corrected correlation coefficient.  I found the
Gullickson & Hopkins 1976 article and they point out that it was
Pearson 1903 who attempted to come up with a solution to the
restriction of range problem.  They make the following point:

|Appropriate use of the estimator R [the corrected estimate of
|rho] is hampered by an absence of knowledge of its properties,
|i.e., bias, consistency, and efficiency.  Of particular concern is the
|probable change in its efficiency, i.e., the precision with which it
|estimates rho, as the explicit selection of the X variable is changed.

They go on to explain why identifying the properties of the
corrected estimate R is difficult (it depends upon three parameters
which are interrelated).  The rest of the article is devoted to ways of
defining an interval for rho = zero and R's relationship to it.

Now, this suggests that there are problems with the use of the
correlation coefficient corrected for restriction of range which
appear to be glossed over (as in Glass & Hopkins 1996).
A quick search through PsycInfo for articles on restriction of
range and articles that cite Gullickson & Hopkins doesn't point to
any new developments that eliminate the concerns that Gullickson
& Hopkins raise.  If there are any, I'd like to get the reference.

However, as interesting as the above stuff is, it actually isn't
the original point I was focusing on:

Mike Palij had written:
>>(2)  Imagine that we have two variables, assume that both
>>might be normally distributed and that we can calculate a
>>correlation between the two variables.  The problem is
>>variable X (say SAT scores) has a peculiar relationship to
>>variable Y (say GPA scores) namely
>>Prob (Y | X greater than a critical value) > 0 and
>>Prob (Y | X less than a critical value) = 0.

Let me reiterate and be explicit about this situation.
If one examines Figure 7.11 in Glass & Hopkins 3rd ed (p. 121)
we have a scatterplot with both X and Y being normal
distributions.  The scatterplot shows that across the full
range of X,Y values, we have an apparent correlation.  However,
if we only use values above a specific X value (e.g., SAT
scores equal to or above 1200) then the scatterplot is restricted
to only X,Y values above the cutoff (>1200).  This is Case (1)
I presented in my earlier email.
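
For anyone who wants to see Case (1) numerically, here is a minimal
sketch in Python.  It is not Ken's data set or the Glass & Hopkins
example: it assumes standardized bivariate normal X and Y, a population
correlation of 0.8 (borrowed from Ken's simulation), and a hypothetical
cutoff one standard deviation above the mean of X.  The function name
and the parameters are made up for illustration.

```python
import numpy as np

def restricted_r(rho=0.8, n=100_000, cutoff_z=1.0, seed=0):
    """Compare r in the full sample with r among cases where X >= cutoff_z.

    Assumptions (illustrative only): X and Y are standardized bivariate
    normal with correlation rho, and selection is on X alone.
    """
    rng = np.random.default_rng(seed)
    cov = [[1.0, rho], [rho, 1.0]]
    x, y = rng.multivariate_normal([0.0, 0.0], cov, size=n).T

    keep = x >= cutoff_z                      # explicit selection on X
    r_full = np.corrcoef(x, y)[0, 1]
    r_cut = np.corrcoef(x[keep], y[keep])[0, 1]
    return r_full, r_cut, int(keep.sum())

r_full, r_cut, n_kept = restricted_r()
print(f"full sample r = {r_full:.2f}")
print(f"restricted r  = {r_cut:.2f} (n = {n_kept})")
```

The restricted r should come out clearly below the full-sample r, and
raising cutoff_z shrinks it further, which is the pattern Ken described.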

In Case 2 above there is restriction of range in X but not in Y.
Presumably our theoretical college doesn't suffer from really
bad grade inflation, so its GPA values will range from 0.00 (F)
to 4.00 (A).  Extending the range of SAT scores below 1200
will not produce a corresponding increase in the range or variance
of GPA values.

The correction Ken Steele refers to below is demonstrated by
Glass & Hopkins, but with the standard deviation for SAT in the
restricted sample represented by 50 points and the standard
deviation in the unrestricted sample = 100 points.  But this seems
to assume that there will be a corresponding increase in the range or
variance of the Y (GPA) variable, but that's impossible (unless everyone
is getting As and Bs).  This suggests that there is heterogeneity of
variance in X and Y, and
I believe that the usual correction assumes homogeneity of variance.
I need to track down a book by Thurstone on personnel selection
which may deal with this situation.
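
To make the arithmetic concrete, here is a minimal sketch of the
textbook correction for explicit selection on X.  I am assuming this is
the form behind Equation 7.5; I have not copied it from Glass & Hopkins.
The standard deviations of 100 and 50 come from the example above, but
the restricted r of 0.30 is a hypothetical value chosen only for
illustration, and the function name is my own.

```python
import math

def correct_for_range_restriction(r, sd_unrestricted, sd_restricted):
    """Estimate the unrestricted correlation from a restricted-sample r.

    Standard correction for direct (explicit) selection on X:
        R = r*u / sqrt(1 - r**2 + r**2 * u**2),  u = S_unrestricted / s_restricted
    It assumes a linear regression of Y on X and equal conditional
    variance of Y at every X (homoscedasticity).
    """
    u = sd_unrestricted / sd_restricted
    return (r * u) / math.sqrt(1.0 - r**2 + (r**2) * (u**2))

r_hypothetical = 0.30   # illustrative restricted-sample r, not from any source
corrected = correct_for_range_restriction(r_hypothetical, 100.0, 50.0)
print(f"restricted r = {r_hypothetical:.2f}, corrected R = {corrected:.2f}")
```

Note that only the two standard deviations of X enter the calculation.
The variance of Y is never an input, so the result leans entirely on
the linearity and equal-conditional-variance assumptions.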

>This comment is based on my reading of Rosenthal & Rosnow's
>"Essentials" (2008, pp. 349-353). Here is how they explain Pearson's
>correction procedure for the change in r using the example of some
>hypothetical academic skills test and the effect on r going from HS GPA
>to the restricted group of 1st year college GPA.
>
>The essence of the correction/modification is that you need variability
>measures from both the full group and the restricted-range group.
>One r is converted to the other r by weighting the change in variability.
>
>I would presume that ETS has access to data such that it has a good
>estimate of variabilities of SAT scores for HS students.

If not, they could always look at Figure 6.5 in Glass & Hopkins (3rd ed).
But, of course, ETS calculates the parameters for each administration of
the SAT and knows very well what the distributional properties of
the SAT scores are.

>So the Sackett et al. correction does not seem as spooky as it seemed
>on my first read.

Actually, parts of it seem even spookier now.  And don't get me started
on why they shouldn't have left out Age of English Acquisition (AEA) when
looking at SAT-GPA correlations (AEA has a significant negative correlation
with SAT scores).

[snip Mike Palij's comments on Snedecor & Cochran on ANCOVA]
>I happened to have a copy of S&C. I think the example you are thinking
>of is on p. 431-432 which looks at an ANCOVA involving school
>expenditures in 5 states adjusted for per capita income. Their point is
>that if the per capita income were increased in the poorer states then
>that increase would not necessarily be apportioned to education.

My copy is still MIA but my memory is that this is not the example
or the main point.  I'll have to do some more checking but thanks for
looking.

-Mike Palij
New York University
[EMAIL PROTECTED]

Glass, G. V., & Hopkins, K. D. (1996). Statistical methods in education
and psychology (3rd ed.). Needham Heights, MA, US: Allyn & Bacon.

Gullickson, A., & Hopkins, K. (1976). Interval estimation of correlation
coefficients corrected for restriction of range. Educational and
Psychological Measurement, 36(1), 9-25. doi:10.1177/001316447603600102


