Hi

Warning ... this is a simulation that may be of interest only to those "really"
interested in the question of restriction of range and its correction.

A. Effect of Selection on SAT on SD of GPA

I initially set out to demonstrate that the SD for GPA would shrink with
selection on SAT, to verify one point of discussion in this thread.  The SPSS
program at the end of this message generates 100,000 SAT scores (Mu = 500,
Sigma = 100) and GPAs (Mu = 2.5, Sigma = .5) from a population with Rho = .71
(i.e., about 50% of the variance in GPA is predicted by SAT).
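
For anyone who would rather follow the data generation outside SPSS, here is a
minimal Python sketch of the same idea. It is not the program I ran; the
function names (simulate, pearson_r) and the seed are my own assumptions for
illustration. The point it shows is that weighting a standard normal SAT score
and an independent standard normal error by sqrt(.5) = .7071 each yields a GPA
with unit variance and Rho = .7071.

```python
import math
import random

def simulate(n=100_000, rho=math.sqrt(0.5), seed=1):
    """Draw n (SAT, GPA) pairs with population correlation rho (hypothetical helper)."""
    rng = random.Random(seed)
    w = math.sqrt(1 - rho ** 2)          # weight on the independent error term
    sat, gpa = [], []
    for _ in range(n):
        z_sat = rng.gauss(0, 1)
        z_gpa = rho * z_sat + w * rng.gauss(0, 1)
        sat.append(round(500 + 100 * z_sat))   # whole SAT points, mean 500, SD 100
        gpa.append(2.5 + 0.5 * z_gpa)          # mean 2.5, SD .5
    return sat, gpa

def pearson_r(x, y):
    """Plain Pearson correlation, written out to avoid any dependencies."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

if __name__ == "__main__":
    sat, gpa = simulate()
    print("r for the full sample:", round(pearson_r(sat, gpa), 3))
```

Because the seed and the random number generator differ from SPSS, the sample
values will not match mine exactly, only approximately.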

I then selected successive samples on basis of SAT scores giving the following 
results.

Criterion     N     SD SAT   SD GPA    r
No select    100K    100      .50     .71
SAT>500      ~50K     59      .41     .52
SAT>600      ~16K     44      .38     .40
SAT>700      ~ 2K     33      .37     .31
SAT>800       132     22      .33     .28

SD SAT and the correlation both shrink, as expected, given selection was on 
SAT.  But the SD GPA also is reduced, albeit not as markedly as SD for SAT.
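
The size of the GPA shrinkage can be checked against a simple formula. This is
a sketch under stated assumptions, not part of my SPSS run: it assumes direct
selection on SAT, a linear regression of GPA on SAT, and residuals whose
variance does not depend on SAT (all true for bivariate normal data). Then only
the SAT-predictable part of the GPA variance shrinks, in proportion to the
shrinkage in SAT variance. The function name expected_sd_y is my own.

```python
import math

def expected_sd_y(sd_y_pop, rho, sd_x_sel, sd_x_pop):
    """SD of Y after direct selection on X (hypothetical helper).

    Var(Y | selected) = sd_y_pop^2 * (1 - rho^2 * (1 - sd_x_sel^2 / sd_x_pop^2))
    """
    var_ratio = (sd_x_sel / sd_x_pop) ** 2
    return sd_y_pop * math.sqrt(1 - rho ** 2 * (1 - var_ratio))

if __name__ == "__main__":
    # SD SAT values taken from the table above; population SD SAT = 100.
    for sd_x_sel in (100, 59, 44, 33, 22):
        print(sd_x_sel, round(expected_sd_y(0.5, 0.71, sd_x_sel, 100), 3))
```

The predicted values come out within about .01 of the simulated SDs for the
first three cut-offs, and somewhat higher (about .36 versus .33) for SAT > 800,
where only 132 cases remain.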

B. Accuracy of Correction for Restriction of Range

Given I had the datasets, I then decided to test out something I had asked Ken
about, namely, how good the correction for restriction of range is.  I read in
the values for SD SAT (i.e., SDx) and r at each degree of selection and then
computed the standard correction for what is called direct restriction of
range (which is what I produced by selecting on SAT).  The corrected values are
shown below in the columns headed rho#.  The five columns are algebraically
equivalent versions of the same formula, hence the identical values: I made a
mistake initially and was so surprised at the weird results that I kept looking
for and implementing other versions, as shown in the SPSS commands at the end.
The results were no longer weird once I corrected my mistake.

To illustrate, the third row selects cases above 600 (1 SD above the mean),
which amounts to about 16% of the 100,000 scores.  The sample r of .40 produces
a corrected rho of .70, quite close to the actual value.  With extreme
selection (bottom row, SAT > 800, .13% of scores), the formula appears to
overcorrect, although with only 132 cases the sample r is itself imprecise.

   sdx      r   sigx  sigx2   rho1   rho2   rho3   rho4   rho5
100.00  .7100 100.00  10000  .7100  .7100  .7100  .7100  .7100
59.000  .5200 100.00  10000  .7181  .7181  .7181  .7181  .7181
44.000  .4000 100.00  10000  .7042  .7042  .7042  .7042  .7042
33.000  .3100 100.00  10000  .7029  .7029  .7029  .7029  .7029
22.000  .2800 100.00  10000  .7984  .7984  .7984  .7984  .7984
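
For readers without SPSS, the correction can be sketched in a few lines of
Python. This is an illustration rather than the code I ran: the function name
correct_direct_rr is my own, and it implements only one of the equivalent forms
(the one in the rho2 column), the usual direct range restriction correction
that is often labelled Thorndike's Case 2.

```python
import math

def correct_direct_rr(r, sd_restricted, sd_unrestricted):
    """Correct a restricted-sample r for direct range restriction on X.

    rho = r*u / sqrt(1 - r^2 + r^2 * u^2), with u = sd_unrestricted / sd_restricted.
    """
    u = sd_unrestricted / sd_restricted
    return r * u / math.sqrt(1 - r ** 2 + (r * u) ** 2)

if __name__ == "__main__":
    # (SD SAT in the selected sample, sample r), as read in from the results above.
    rows = [(100, 0.71), (59, 0.52), (44, 0.40), (33, 0.31), (22, 0.28)]
    for sdx, r in rows:
        print(sdx, r, round(correct_direct_rr(r, sdx, 100), 4))
```

When there is no restriction (u = 1) the function returns r unchanged, which is
a quick sanity check on any implementation of the formula.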

The articles I briefly looked at to obtain the various formulas certainly made
clear that the entire issue is far more complicated than I had appreciated
before this discussion.  Factors considered include such things as the sampling
ratio, the shape of the distributions, and the underlying basis for selection.
A recent article, for example, observed that the standard correction that I
used generally UNDERCORRECTS for what is called indirect restriction of range.
This involves selection on the basis of some third variable related to X and Y,
and is thought to characterize many selection situations.  That is, actual
selection is seldom JUST on the predictor being adjusted.  The authors
speculate that many current findings might need to be reconsidered and that
the underlying relationships might actually be stronger than previously
thought.  See Schmidt et al., 2006, Personnel Psychology.

Take care
Jim

The SPSS programs appear below.

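* Generate 100,000 standardized SAT and GPA scores with rho = .7071.
input program.
loop o = 1 to 100000.
comp sat = rv.norm(0,1).
comp gpa = rv.norm(0,1)*.7071 + sat*.7071.
end case.
end loop.
end file.
end input program.
* Rescale to SAT (mean 500, SD 100, rounded) and GPA (mean 2.5, SD .5).
comp sat = rnd(500 + sat*100).
comp gpa = 2.5 + gpa*.5.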

corr gpa sat /stat.
     Mean       Std. Deviation N      
 gpa 2.498497   .4986521       100000 
 sat 499.736190 99.6350487     100000 

                     gpa    sat    
 gpa Pearson         1      .706   

temp.
select if sat > 500.
corr gpa sat /stat.
     Mean       Std. Deviation N     
 gpa 2.779867   .4127305       49824 
 sat 579.504335 59.7406878     49824 

                     gpa   sat   
 gpa Pearson         1     .516  

temp.
select if sat > 600.
corr gpa sat /stat.
     Mean       Std. Deviation N     
 gpa 3.040882   .3842720       15578 
 sat 652.300488 44.1317182     15578 

                     gpa   sat   
 gpa Pearson         1     .400  

temp.
select if sat > 700.
corr gpa sat /stat.
     Mean       Std. Deviation N    
 gpa 3.334208   .3693546       2184 
 sat 737.050824 33.3068533     2184 

                     gpa  sat  
 gpa Pearson         1    .309 

temp.
select if sat > 800.
corr gpa sat /stat.
     Mean       Std. Deviation N   
 gpa 3.632516   .3280184       132 
 sat 824.863636 21.9338954     132 

                     gpa  sat  
 gpa Pearson         1    .279 

*based on n = 100k.
data list free / sdx r.
begin data
100     .71
59      .52
44      .40
33      .31
22      .28
end data.
comp sigx = 100.
comp sigx2 = sigx**2.
* rho1 to rho5 are algebraically equivalent forms of the same correction.
comp rho1 = (sigx*r)/sqrt(sdx**2*(1-r**2)+sigx2*r**2).
comp rho2 = (r*sigx/sdx)/sqrt(1 - r**2 + (r**2)*(sigx/sdx)**2).
comp rho3 = (sigx/sdx)*r/sqrt(((sigx/sdx)**2)*r**2 - r**2 + 1).
comp rho4 = r/sqrt(r**2 + (1-r**2)*sdx**2/sigx2).
comp rho5 = ((1/(sdx/sigx))*r)/sqrt((1/(sdx/sigx)**2 - 1)*r**2 + 1).
list.

   sdx      r   sigx  sigx2   rho1   rho2   rho3   rho4   rho5
100.00  .7100 100.00  10000  .7100  .7100  .7100  .7100  .7100
59.000  .5200 100.00  10000  .7181  .7181  .7181  .7181  .7181
44.000  .4000 100.00  10000  .7042  .7042  .7042  .7042  .7042
33.000  .3100 100.00  10000  .7029  .7029  .7029  .7029  .7029
22.000  .2800 100.00  10000  .7984  .7984  .7984  .7984  .7984


James M. Clark
Professor of Psychology
204-786-9757
204-774-4134 Fax
[EMAIL PROTECTED]
