On Tue, 23 Jan 2001 18:51:25 GMT, Gene Gallagher
<[EMAIL PROTECTED]> wrote:

 < snip >
> The DOE's latest description of the conversion of raw points to scaled
> scores is for the 1999 exam.  These reports were issued in Fall 2000.  The
> appendices of this 52-page PDF sent to the schools give the scaling for
> the 1999 exam:
> http://www.doe.mass.edu/mcas/99results/interp99/full_guide.pdf
> 
 < ... > 
> Other DOE documents state that a score over 280 or under 200 is
> possible on some tests in some years, but these extremes are
> converted to 280 or 200, respectively.
> 
> In the following technical report on the 1999 exam, they provide examples
> of how to convert raw scores (e.g., 1 point for getting one correct
> answer) to the 200 to 280 scale:
> 
> http://www.doe.mass.edu/mcas/2000docs/pdf/99techrep.pdf
> 
> There is something really odd about how these scaling equations are
> applied.  Check out page 70 of this 112-page pdf (page 66 in the
> report's pagination).  Two regression equations are provided for
> converting raw scores, r, to scaled scores, S:
>
>  S = 1.59 r + 177.25   if r < 38.83
>
>  S = 2.65 r + 136.19   if r > 38.83
> 
> Now, 38.83 is the raw score that a panel had decided was the cutoff
> for proficiency.  So, if you are proficient, every point you score adds
> 2.65 points to your scaled score.  If a student misses the proficiency
> cutoff of 38.83, each correct answer is worth only 1.59 points.
> 
> 
> Presumably, the reason for these odd scalings is the need to keep scores
> comparable from year to year, but I can't justify such odd conversions.

 - I want to point out a couple of details.
In the first equation above, if the raw score (number correct) is
r = 0, then S = 177.25.
S does not reach the reported minimum of 200 until r = 14 or so
(1.59 * 14.3 + 177.25 = 200, approximately).

I assume that this is intentional, and that the designers are
following the convention that a pupil's score is set to the
minimum (200) if the pupil doesn't score above chance.  They are
truncating the bottom performances.
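
For concreteness, here is a minimal sketch of the two-piece conversion
in Python, with the 200-to-280 truncation applied afterward.  The
function name, the rounding, and the decision to put the r = 38.83
boundary in the upper piece are my own assumptions, not anything
stated in the technical report:

def mcas_scaled_score(raw):
    """Convert a raw score to the 200-280 reporting scale (sketch)."""
    if raw < 38.83:
        scaled = 1.59 * raw + 177.25   # below the proficiency cutoff
    else:
        scaled = 2.65 * raw + 136.19   # at or above the cutoff
    # Truncate the extremes to the reported 200-280 range.
    return round(min(max(scaled, 200.0), 280.0))

print(mcas_scaled_score(0))    # 200 (raw formula gives 177.25; clamped up)
print(mcas_scaled_score(14))   # 200 (1.59*14 + 177.25 = 199.51)
print(mcas_scaled_score(39))   # 240 (2.65*39 + 136.19 = 239.54)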

From the formula, a raw score of 39 produces an S of about 239, which is
about the mid-point of their scale.  The upper equation reaches the 280
ceiling near a raw score of 54 (solving 2.65 r + 136.19 = 280 gives
r = 54.3), so it appears that they want to use the last 15 items to
stretch out the top scoring.  I suppose they could try to do that by
assigning more points to the toughest items -- but that would make it
harder to account for the "chance" amount in the way that they are
doing right now.  Instead, it appears that they have used this two-part
scoring to create a distribution of the shape that they desire.
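
Using the sketch above, that stretch at the top is easy to see (the
raw scores in the loop are just illustrative values I picked):

for raw in (39, 44, 49, 54):
    print(raw, mcas_scaled_score(raw))
# 39 240
# 44 253
# 49 266
# 54 279

So roughly 15 raw points cover the upper 40 points of the scale, while
below the cutoff each additional correct answer moves the scaled score
only 1.59 points.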

This is a little odd, but I am not bothered by it nearly as much as
Gene seems to be.  For a really close study of the schools, one might
want to know how many students scored above the C1, C2, C3, ... cutoffs.
But it would be highly unusual for a school to be so poorly represented
by its mean that this scoring could make a difference.

-- 
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html

