In article <[EMAIL PROTECTED]>,
  [EMAIL PROTECTED] (dennis roberts) wrote:
> i went to some of the sites given in the urls ... and, quite frankly, it is
> kind of difficult to really get a feel for what has transpired ... and how
> targets were set ... and how goals were assessed
>
> regardless of whether we like this kind of an approach for accountability
> ... or not ... we all have to admit that there are a host of problems with
> it ... many of these are simply political and policy oriented (in fact,
> these might be the largest of the problems ... when the legislature starts
> enacting regulations without a real good understanding of the
> methodological problems) ... some are measurement related ... and yes,
> some are statistical in nature
>
> we do have components of this total process
>
> 1. there are tests that are developed/used/administered/scored ... in 4th
> and 8th and 10th grades ... these are NOT the same tests of course ... so,
> one is never sure what it means to compare "results" from say the 8th grade
> to the 4th grade ... etc.

In the rating guide, the authors state that the 1998, 1999, and 2000 tests
shared a set of core questions.  Scaled scores range from 200 to 280, and
students who answer the same number of core questions correctly receive the
same scaled score whether they took the 1998, 1999, or 2000 test.
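
A minimal sketch of that equating idea, with an invented raw-to-scaled
conversion (the actual table is in the rating guide; the numbers below are
placeholders):

import itertools

# Invented conversion from number of core questions correct to a 200-280
# scaled score; the real MCAS conversion table is not reproduced here.
raw_to_scaled = {raw: 200 + 2 * raw for raw in range(41)}  # 0-40 correct

# Because the same table applies to the 1998, 1999, and 2000 forms, two
# students with the same raw core score get the same scaled score even
# though they took the test in different years.
student_1998, student_2000 = 27, 27
print(raw_to_scaled[student_1998] == raw_to_scaled[student_2000])  # True
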
>
> 2. then we have the problem that one year ... we have the data on the 4th
> graders THAT year ... but, the next year we have data on the 4th graders
> for THAT year ... these are not the same students ... so any direct
> comparison of the scores ... 4th to 4th ... or 8th to 8th or 10th to 10th
> ... are not totally comparable ... so, ANY difference in the scores ... up
> or down ... cannot be necessarily attributed to improvement or lack of
> improvement ... the changes could be related and in fact, totally accounted
> for because there are small changes in the abilities of the 4th graders one
> year compared to another year ... (or many other reasons)

Other reasons include changes in class size, among other factors.  Each
school's 1998 score was compared to the mean of its 1999 and 2000 scores,
and schools were expected to show a 2-point gain over their 1998 score.  But
a school that scored near the top in 1998 probably did so partly by chance,
so with random variation around an unchanging mean (or even a moderately
increasing one), its 1999-2000 average will tend to regress back toward the
mean.  A top-performing 1998 school would therefore likely earn a failing
grade in 2000 despite no real change in its performance.
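
A toy simulation of that regression-to-the-mean effect (made-up numbers on
the 200-280 scale, not the actual MCAS data): school quality never changes,
yet most of the 1998 top decile falls short of a fixed 2-point target.

import numpy as np

# Hypothetical schools: true quality is fixed; only year-to-year sampling
# noise moves the observed school-wide average.
rng = np.random.default_rng(0)
n_schools = 300
true_mean = rng.normal(240, 8, n_schools)              # stable quality, 200-280 scale
score_1998 = true_mean + rng.normal(0, 3, n_schools)   # observed 1998 average
# Average of the 1999 and 2000 years; averaging two years halves the noise variance.
later_avg = true_mean + rng.normal(0, 3 / np.sqrt(2), n_schools)

top_1998 = score_1998 >= np.percentile(score_1998, 90)  # "top performing" in 1998
gain = later_avg[top_1998] - score_1998[top_1998]
print("mean change for 1998 top-decile schools:", round(float(gain.mean()), 2))
print("fraction meeting a +2 point target:", round(float((gain >= 2).mean()), 2))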

> 3. as i said before, we also have the problem of using aggregated
> performance ... either averages of schools and/or averages for districts
> ... when we line them up and then assign these quality names of very high,
> high, etc.
> there is a necessary disconnect between the real goals of education ...
> that is, helping individual kids learn ... and the way schools or districts
> are being evaluated ... when averages are being used ...
>
> 4. i would like to know how on earth ... these 'standards' for
> "dictated" improvement targets were derived ... did these have anything to
> do with real data ... or an analysis of data ... or, were just roundtabled
> and agreed to? we have to know this to be able to see if there is any
> connection between policy targets and statistical problems

I don't know how these decisions are reached.  At the local level, we are
seeing the consequences of this MCAS testing.  Teachers are being forced to
adjust their curricula to cover the types of questions that the MCAS asks.
Despite that, good schools and teachers statewide are being branded as
failures because of this flawed evaluation process.
>
> 5. we have to try to separate relative standing data from actual test score
> gain information ... and we don't know how or if the ones setting the
> standards and making decisions ... know anything about this problem
>
> so, to summarize ... there are many many issues and problems with
> implementing any system whereby you are trying to evaluate the performance
> of schools and districts ... and, perhaps the least important of these is a
> statistical one ... set in the context of political policy matters ... that
> a legislature works with ... and "legislates" targets and practices without
> really understanding the full gamut of difficulties when doing so
>
> unfortunately, in approaches like this, one makes an assumption that if a
> school ... or district ... gets better (by whatever measure) ... that this
> means that individual students get better too ... and we all know of course
> that this is NOT NECESSARILY TRUE ... and in fact we know more than that
> ... we know that it is NOT true ... in many cases
>
> sure, it is important to make some judgements about how schools and
> districts are doing ... especially if each gets large sums of money from
> taxpayers ... but, the real issue is how we work with students ... and how
> each and every one of them do ... how each kid improves or not ... and, all
> these approaches to evaluating schools and districts ... fail to keep that
> in mind ... thus, in the final analysis, all of these systems are
> fundamentally flawed ... (though they still may be useful)
>

I agree with you that schools and school districts should be accountable.
I fear that the present evaluation system is not solving the core problems.
It is not doing a good job of identifying and rewarding top schools, and it
may be unfairly flogging poor-performing schools.  From top to bottom, the
MCAS evaluation appears to be giving the public the feeling that their
teachers are failing them.  I think a valid statistical analysis of these
data is needed.

There was an incredibly wrong-headed response to this MCAS evaluation by
Suffolk University's Beacon Hill Institute, covered in both the Boston Globe
and the Herald yesterday.  The report can be read at:

http://www.beaconhill.org/BHIStudies/EdStudyexecsum.pdf

These authors argued that the MCAS evaluation didn't do a good job of
identifying effective schools.  They used a logit regression to model
district-wide scores.  One of their major conclusions is that, in
top-performing school districts, larger class sizes increase MCAS scores.

These are the 11 explanatory variables used to explain the 1998 MCAS
scores:

(A) Policy variables:
1) Increase in funding from 1994 to 1998
2) Change in student teacher ratio from 1994 to 1998
3) Number of students per computer
(B) Socioeconomic variables
4) Crime rate
5) % professionals in a district
6) % single parents
7) Dummy variable indicating urban vs. non-urban
(C) Choice variables
8) % students in Charter schools in district
9) % students bussed in by METCO
10) % students in public schools
(D) Previous tests
11) The district performance based on the 1994 statewide MEAPS test

My analysis:
This regression analysis is almost guaranteed to have severe
multicollinearity problems.  As such, one cannot trust the magnitude or even
the sign of the coefficients for these explanatory variables.  The authors
nevertheless proceed to discuss how their study indicates the state should
change funding for schools:

a) Increases in funding don't lead to higher performance! (The authors are
proud that they didn't use actual funding per student as a variable.)

b) REDUCING THE STUDENT-TO-TEACHER RATIO ACTUALLY WORSENED TEST
PERFORMANCE IN THE 8TH AND 10TH GRADES FOR HIGH-PERFORMING SCHOOLS.

There are likely strong correlations between many of these explanatory
variables, and an analysis of the variance inflation factors (VIFs) for the
explanatory variables would show this.  Because of this problem, the authors
should not trust the magnitude or even the sign of the coefficients of these
variables.  I greatly doubt that larger student-teacher ratios lead to
improved student performance in high-performing schools, as the authors
claim.
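
For what it's worth, this check is simple to run if one has the district-level
table.  The sketch below assumes a hypothetical file and column names (they
are not the Institute's actual variable names):

import pandas as pd
import statsmodels.api as sm
from statsmodels.stats.outliers_influence import variance_inflation_factor

# Hypothetical: one row per district, one column per explanatory variable.
# These column names are placeholders, not the study's actual names.
cols = ["funding_change_94_98", "ratio_change_94_98", "students_per_computer",
        "crime_rate", "pct_professionals", "pct_single_parents", "urban",
        "pct_charter", "pct_metco", "pct_public", "meap_1994"]
X = sm.add_constant(pd.read_csv("districts.csv")[cols])

# A VIF well above 10 means a variable is close to a linear combination of
# the others, so its estimated coefficient (size and sign) is unstable.
for i, name in enumerate(X.columns):
    if name != "const":
        print(f"{name:25s} VIF = {variance_inflation_factor(X.values, i):.1f}")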

The authors base many of their policy recommendations on what are
probably invalid regression coefficients:
 * no need to reduce class sizes in top schools
 * don't increase funding for top-performing schools
 * "we can't improve performance by spending more."
 * the state should shift funding from high-performing schools to
low-performing schools, since increasing class size in high-performing
schools and reducing class size in low-performing schools will both increase
performance.

Is it really a surprise to anyone familiar with multicollinearity that,
after you've included all 11 variables shown above, you don't see much of an
effect of "increase in funding" on MCAS scores?  Is it any surprise that the
student-teacher ratio is negatively correlated with performance after you've
included the other 10 variables in the equation?
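
Here is a toy simulation (fabricated data, not the MCAS figures) of the kind
of sign instability I mean: class size has a genuinely negative effect on
scores, but when it is nearly collinear with funding, the fitted coefficient
frequently comes out positive in any single sample.

import numpy as np

# Simulated districts: the true class-size effect on scores is negative, but
# class size is constructed to be collinear with funding to varying degrees.
rng = np.random.default_rng(1)

def wrong_sign_rate(collinearity, n=300, trials=1000):
    wrong = 0
    for _ in range(trials):
        funding = rng.normal(0, 1, n)
        class_size = (-collinearity * funding
                      + rng.normal(0, np.sqrt(1 - collinearity**2), n))
        score = 1.0 * funding - 0.3 * class_size + rng.normal(0, 3, n)
        X = np.column_stack([np.ones(n), funding, class_size])
        beta = np.linalg.lstsq(X, score, rcond=None)[0]
        wrong += beta[2] > 0            # wrong sign: the true effect is negative
    return wrong / trials

print("wrong-sign rate, weak collinearity: ", wrong_sign_rate(0.2))
print("wrong-sign rate, strong collinearity:", wrong_sign_rate(0.98))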

This Beacon Hill Institute study shows how statistics are being abused
in this MCAS evaluation process.


--
Eugene D. Gallagher
ECOS, UMASS/Boston

