On Wed, 10 Jan 2001 21:32:43 GMT, Gene Gallagher
<[EMAIL PROTECTED]> wrote:

> The Massachusetts Dept. of Education committed what appears to be a
> howling statistical blunder yesterday.  It would be funny if not for the
> millions of dollars, thousands of hours of work, and thousands of
> students' lives that could be affected.
> 
< snip, much detail > 
> I find this really disturbing.  I am not a big fan of standardized
> testing, but if the state is going to spend millions of dollars
> implementing a state-wide testing program, then the evaluation process
> must be statistically valid.  This evaluation plan, falling prey to the
> regression fallacy, could not have been reviewed by a competent
> statistician.
> 
> I hate to be completely negative about this.  I'm assuming that
> psychologists and others involved in repeated testing must have
> solutions to this test-retest problem.

The proper starting point for comparing a school against itself
should be the estimate of the "true score" for the school:
the regressed-predicted value under circumstances of
no-change.  "No-change" at the bottom would be satisfied
by scoring only a little better; no-change at the top
would be met by scoring only a little worse.  If you are
dumping money into the whole system, then you might
hope to (expect to?) bias the changes in a positive direction.
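
For concreteness, here is a minimal Python sketch of that no-change
baseline, using Kelley's classical true-score formula; the statewide
mean and the reliability below are invented for illustration, not
taken from anything the Department published.

# Regressed "true score" baseline (Kelley's formula):
#   T_hat = M + r * (X - M)
# X = school's observed mean, M = statewide mean,
# r = school-level test-retest reliability.
def regressed_baseline(observed_mean, state_mean, reliability):
    """Predicted score under 'no change', after regression to the mean."""
    return state_mean + reliability * (observed_mean - state_mean)

state_mean = 230.0   # hypothetical statewide scaled-score mean
reliability = 0.8    # hypothetical reliability, chosen for illustration

for school_mean in (215.0, 230.0, 245.0):
    baseline = regressed_baseline(school_mean, state_mean, reliability)
    print(school_mean, "->", round(baseline, 1))

# A school observed at 215 is predicted to drift up to 218, and one
# at 245 to drift down to 242, with no real change in instruction.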

I thought it was curious that the
      "schools in the highest two categories were expected to increase
      their average MCAS scores by 1 to 2 points, while schools in the
      lowest two categories were expected to improve their scores by 4-7
      points."

That sounds rational in form.  It appears to me that their model
might have had the right structure, but the numbers surprised them.
That is, it looks as if someone allowed for regression of a couple
of points, then hoped for a gain of 4 or 5 points on top of that.
That (probably) under-estimated the regression to the mean, and
over-estimated how much a school could achieve by freshly
swearing to good intentions.
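
To put a rough number on the size of the effect, here is a small
Python simulation (all of the parameters are invented; none come from
the MCAS data) in which no school changes at all between years.
Sorting on the first year's scores still manufactures "gains" at the
bottom and "losses" at the top, of a size that is easy to confuse
with real improvement.

import random

random.seed(1)
n_schools = 2000
state_mean = 230.0
true_sd = 8.0      # hypothetical spread of schools' true means
noise_sd = 5.0     # hypothetical sampling/measurement noise per year

schools = []
for _ in range(n_schools):
    true = random.gauss(state_mean, true_sd)
    year1 = true + random.gauss(0, noise_sd)
    year2 = true + random.gauss(0, noise_sd)   # no real change at all
    schools.append((year1, year2))

schools.sort()                     # order schools by their year-1 score
quarter = n_schools // 4
for label, group in (("bottom quarter", schools[:quarter]),
                     ("top quarter", schools[-quarter:])):
    mean_change = sum(y2 - y1 for y1, y2 in group) / quarter
    print(label, "mean change:", round(mean_change, 1))

# With these made-up numbers the bottom quarter "gains" a few points
# and the top quarter "loses" about as much, purely from regression.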

What is needed -- in addition to self-serving excuses -- is an
external source of validation.  And it should confirm improvement
in cases that are not already predicted by regression to the mean.

-- 
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html

