Dennis,

Not sure how this is/could be done in education, but in production, I'd say
there are at least two sources of variation available: the (presumed) ability of
the student, which results in different performance on the test, and variation
due to the grader.  One could make box plots for each subgroup (defined by
grader).  Now, if any subgroup is well away from the others, we can conclude that
either (a) the grader was an 'outlier,' or (b) the subgroup was not similar to the
others.
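
A minimal sketch of the per-grader box plot idea, in text form: it prints the five numbers a box plot would draw for each grader. The scores and the name box_summary are invented for illustration (5 graders, 10 assignments each, matching the example quoted below); nothing here is real data.

```python
import statistics

# Hypothetical scores, made up for illustration only:
# 5 graders, 10 randomly assigned papers each.
scores_by_grader = {
    "A": [72, 75, 68, 80, 77, 71, 74, 69, 78, 73],
    "B": [70, 76, 73, 79, 66, 74, 72, 77, 71, 75],
    "C": [74, 69, 77, 72, 75, 70, 78, 73, 68, 76],
    "D": [71, 73, 78, 67, 75, 72, 76, 70, 74, 79],
    "E": [84, 88, 81, 90, 86, 83, 87, 82, 89, 85],
}


def box_summary(values):
    """Return (min, Q1, median, Q3, max) for one subgroup."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, median, q3, max(values)


for grader, values in scores_by_grader.items():
    low, q1, median, q3, high = box_summary(values)
    print(f"{grader}: min={low} Q1={q1} median={median} Q3={q3} max={high}")
```

A subgroup whose whole box sits above or below the others is the one to look at first; the summary alone cannot say whether the grader or the papers are the cause.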

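How far is 'well away'?  More than 2 pooled stdevs could be a good indicator.  A
one-way AoV test might do the trick, too.  The chance of getting a subgroup 2
stdevs away from the total mean, by sheer luck of the draw, is pretty darn small
- about 5%, 1 in 20.  (Strictly, the 5% figure applies when the distance is
measured in standard errors of the subgroup mean, i.e. pooled stdev / sqrt(n).)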
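
A rough sketch of both checks, under stated assumptions: the scores are invented, the function names (pooled_sd, one_way_f, flag_graders) are hypothetical, and distance from the grand mean is measured in standard errors of a subgroup mean, which is a choice made for this sketch. It computes only the F statistic; turning that into a p-value needs an F table or a stats package.

```python
import math
import statistics

# Hypothetical scores, made up for illustration only.
scores_by_grader = {
    "A": [72, 75, 68, 80, 77, 71, 74, 69, 78, 73],
    "B": [70, 76, 73, 79, 66, 74, 72, 77, 71, 75],
    "C": [74, 69, 77, 72, 75, 70, 78, 73, 68, 76],
    "D": [71, 73, 78, 67, 75, 72, 76, 70, 74, 79],
    "E": [84, 88, 81, 90, 86, 83, 87, 82, 89, 85],
}


def pooled_sd(groups):
    """Pooled within-group standard deviation."""
    ss_within = sum((len(g) - 1) * statistics.variance(g) for g in groups)
    df_within = sum(len(g) - 1 for g in groups)
    return math.sqrt(ss_within / df_within)


def one_way_f(groups):
    """F statistic for a one-way analysis of variance."""
    grand_mean = statistics.fmean([x for g in groups for x in g])
    ss_between = sum(
        len(g) * (statistics.fmean(g) - grand_mean) ** 2 for g in groups
    )
    ms_between = ss_between / (len(groups) - 1)
    ms_within = pooled_sd(groups) ** 2
    return ms_between / ms_within


def flag_graders(scores, threshold=2.0):
    """Return graders whose mean is more than `threshold` standard errors
    from the grand mean, with their standardized distance.

    Approximation: ignores that the grand mean includes the group itself.
    """
    groups = list(scores.values())
    grand_mean = statistics.fmean([x for g in groups for x in g])
    sp = pooled_sd(groups)
    flagged = {}
    for grader, values in scores.items():
        std_error = sp / math.sqrt(len(values))
        z = (statistics.fmean(values) - grand_mean) / std_error
        if abs(z) > threshold:
            flagged[grader] = z
    return flagged


print("F statistic:", one_way_f(list(scores_by_grader.values())))
print("Flagged graders:", flag_graders(scores_by_grader))
```

With several graders checked at once, the chance that at least one crosses the 2-standard-error line by luck is larger than for a single grader, so a flag is a reason to go look, not a verdict.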

Not to argue the ability of students to inadvertently sort themselves into
weirdly deviant groups, but if I saw one subgroup 2 stdevs away from the overall
mean, I would go exploring how the grader did their thing.

Jay

Dennis Roberts wrote:

> At 08:57 PM 4/7/02 +0000, Tristan Miller wrote:
> >Greetings.
> >
> >On Sun, 7 Apr 2002, Glen Barnett wrote:
> > > Assuming you *can* take average student abilities across classes as equal
> >
> >Who said that we are sampling across classes?  I was thinking of the case
> >where the assignments from a single large class are randomly divided among
> >several graders for marking, and one of the graders is an outlier.
>
> say ... you have (just as an example) 50 examinees ... each turning in an
> assignment ... and, randomly assigning them to 5 graders ... 10 assignments
> each ... right?
>
> how will you know for sure IF a grader is aberrant? ... an outlier? ...
> surely, across the graders, there will be mean differences in their
> gradings ... so, how much is now "defined" as too much?
>
> if we make some assumption that IF this person is aberrant ...that is a
> random aberration ... then some linear adjustment might be called for or
> justified but, if that is not the case ... some peculiar way in which this
> grader rates things ... either very high or low ... if he/she does it in
> some strange way DEPENDING on the specific content said by the examinees
> ... then i don't see that such an across the board adjustment can be justified
>
> > > there are a variety of ways you might match mean and s.d.,
>
> matching by mean and sd ... does not solve the potential problem that the
> ORDERings of the examinees may be different FOR that set of examinee papers
> COMPARED to how other graders might have rated these assignments ...
>
> >=================================================================
> >Instructions for joining and leaving this list, remarks about the
> >problem of INAPPROPRIATE MESSAGES, and archives are available at:
> >.                  http://jse.stat.ncsu.edu/                    .
> >=================================================================
>

--
Jay Warner
Principal Scientist
Warner Consulting, Inc.
4444 North Green Bay Road
Racine, WI 53404-1216
USA

Ph: (262) 634-9100
FAX: (262) 681-1133
email: [EMAIL PROTECTED]
web: http://www.a2q.com

The A2Q Method (tm) -- What do you want to improve today?



