Mike's pessimistic view of grades and tests does not jibe with my 
experience.  Just a couple of observations.

1.  Whenever I have examined Cronbach's alpha for my multiple-choice or short-answer tests, I invariably find quite respectable values.  So such tests are certainly not a lottery in the sense that students' answers to each question are like a coin toss.
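
For anyone who wants to check this on their own tests, alpha is easy to compute from an item-by-student score matrix.  A minimal sketch in Python (the score matrix below is made up purely for illustration):

    import numpy as np

    def cronbach_alpha(items):
        # items: rows = students, columns = test items (0/1 or partial credit)
        items = np.asarray(items, dtype=float)
        k = items.shape[1]                         # number of items
        item_vars = items.var(axis=0, ddof=1)      # per-item variances
        total_var = items.sum(axis=1).var(ddof=1)  # variance of total scores
        return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

    # hypothetical 5-student, 4-item score matrix
    scores = [[1, 1, 1, 0],
              [1, 0, 1, 1],
              [0, 0, 1, 0],
              [1, 1, 1, 1],
              [0, 1, 0, 0]]
    print(cronbach_alpha(scores))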

>Grades are the criterion, not the predictor.   One of the odd aspects of this entire area of research is that grades are treated as perfectly reliable and valid.   As a measurement device, they are not held to the same psychometric standard as the SAT.   Once you ponder the unknown reliability and validity of grades, it is likely they should NOT correlate with anything, including the SAT.   Your example is wonderful because it highlights how rarely we examine the reliability of our tests.   These tests are the basis for grades, and all the error in the tests accumulates in the grades.   In addition, if you assign grades based on factors that are not related to test performance, such as attendance or time taken to complete an assignment, then you have introduced variance that has nothing to do with competence.   Why would any reasonable person ever predict that the SAT should correlate with such a score?

2.  My independently scored multiple-choice and short-answer marks invariably 
correlate with one another, albeit far from perfectly, of course.  But 
unreliable measures should not (can not?) show such consistency with one 
another.

>Whatever reliability they have will constrain relationships with other 
scores.
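
The standard attenuation ceiling makes this concrete: the observed correlation between two measures cannot exceed the square root of the product of their reliabilities.  A quick illustration, where the grade reliability is a pure assumption, since nobody measures it:

    import math

    # ~.9 for the SAT is in the published range; .6 for course grades is a
    # hypothetical stand-in, since grade reliability is rarely measured
    r_sat, r_grades = 0.9, 0.6
    ceiling = math.sqrt(r_sat * r_grades)
    print(round(ceiling, 2))  # ~0.73: even a perfect true relationship could
                              # not push the observed correlation above this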

3.  If individual assessments were simply noise, then all students would end up with about the same final average, especially in courses with numerous assessments.  And across all courses, their GPAs would be about the same.  This is clearly not the case.  Indeed, a problem in many courses is exactly the opposite ... a bimodal distribution.  And a corresponding problem at the aggregate level is students who cannot maintain an adequate GPA.
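
A quick simulation shows why (all numbers hypothetical; every "mark" here is pure noise):

    import numpy as np

    rng = np.random.default_rng(0)
    students, assessments = 200, 10
    # pure-noise marks: independent draws, no underlying student ability
    marks = rng.uniform(50, 100, size=(students, assessments))
    finals = marks.mean(axis=1)
    print(round(marks[:, 0].std(), 1))  # single-assessment spread: ~14 points
    print(round(finals.std(), 1))       # final-average spread: only ~4-5 points

Averaging noise squeezes everyone toward the same middle score, so the wide and often bimodal spreads we actually observe could not come from noise alone.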

>This is a straw man.   I never claimed tests were only noise.   If that were the case, then we would not use them.   Your distributions are likely more skewed than bimodal.   If your students are all working hard, then they will have similar performance on your tests.   Since you can't give everyone an A, you have probably designed your tests to enforce a normal curve.   There are various ways you may have done this.   A normal curve should not represent school grades if they are valid.   If you actually design a competency-based course, then everyone should demonstrate competence and get an A.   One of the great problems with American education is the dominance of grading systems that enforce a normal curve and not competence.   It explains how we can have a nation of nonreaders.   Nonreaders get a C and pass along when they should get an F.   When they become competent at reading, they should get an A.

4.  My admittedly subjective judgement of the students I get to know best (i.e., honours students) is that the excellent students are clearly superior to students with lower grades, even when the difference is marginal (e.g., A+ vs. A vs. A-).  The papers, presentations, whatever of the top students are just superior.  And the fact that they stand out in class after class again indicates the consistency of this judgment across courses and faculty.  I would be very surprised if a blind marking of essays by students of different grade levels did not provide validation of the grades.

>This is availability bias.   You stated above that you don't know the nonhonors students.   If you only read the papers of honors students, then you will develop this biased view of superiority.

5.  Restriction of range is clearly a problem in evaluating predictors at the university level, especially at selective institutions.  A colleague was talking at lunch today about the French system.  University is free and very many students attend.  He also observed that very many tend to drop out in the first few years (he mentioned 80% ... perhaps this is the model described by Chris in effect).  I would bet a fair amount that a French equivalent of the SAT would be highly predictive of who would drop out.
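
The effect is easy to demonstrate by correlating a predictor with an outcome in a full applicant pool and then again in just an admitted top slice (hypothetical simulated data):

    import numpy as np

    rng = np.random.default_rng(0)
    n = 100_000
    predictor = rng.standard_normal(n)
    # build an outcome that correlates ~.5 with the predictor in the full pool
    outcome = 0.5 * predictor + np.sqrt(1 - 0.5**2) * rng.standard_normal(n)
    print(round(np.corrcoef(predictor, outcome)[0, 1], 2))   # ~0.50, full pool

    admitted = predictor > np.quantile(predictor, 0.8)       # keep top 20% only
    print(round(np.corrcoef(predictor[admitted], outcome[admitted])[0, 1], 2))
    # ~0.26: the same predictor looks much weaker within the selected group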

>In the few studies that have corrected for restriction of range, the predictive power of both high school GPA/rank and the SAT increases.   However, the difference between them stays the same (approx. .1).   The predictive power added by the SAT does not justify the cost and trouble.

6.  Even accepting the modest existing correlations, however, caution is needed.  While it is true that a small correlation can be significant given a large enough n, it is not true that a small correlation (effect size) is necessarily unimportant.  The classic example is the aspirin study ... a minuscule effect translated into many lives saved because of the huge numbers involved.  Similarly huge numbers are involved when it comes to universities as well, and (like aspirin) the cost of the test is low relative to the cost of a year of university, both for the institution and for the student.
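
The back-of-the-envelope arithmetic, using Rosenthal and Rubin's binomial effect size display and hypothetical numbers:

    # BESD: with correlation r, the "success" rates in the above-median and
    # below-median predictor groups are .5 + r/2 and .5 - r/2
    r = 0.1                 # hypothetical incremental validity of the SAT
    cohort = 2_000_000      # rough order of magnitude of annual test takers
    high, low = 0.5 + r / 2, 0.5 - r / 2
    extra = (high - low) * (cohort / 2)
    print(int(extra))       # ~100,000 more "successes" in the high-scoring
                            # half, on the BESD's simplifying assumptions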

>What can I say, a small effect size is a small effect size.   The use of the SAT produces a giant effect size in the lives of the students.

I would be very interested in evidence that grades or objective tests of 
aptitude/ability/achievement are "like a lottery."

>The acceptance process is the lottery.   If my best prediction is approx. .5, then I am operating like a card counter at the blackjack table.
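
In variance terms, the arithmetic is straightforward:

    r = 0.5
    print(r ** 2)  # 0.25: a correlation of .5 explains a quarter of the
                   # variance, leaving 75% of the outcome unaccounted for

A real edge, like the card counter's, but a long way from a sure thing.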

Mike Williams
http://www.learnpsychology.com


