[I have taken the liberty of copying this to the edstat list.  -- DFB.]

Hi, Steve.  Comments embedded in your post (below).

On Sat, 14 Jun 2003, Steve Kramer wrote:

> Professor Hake (and others),
>
> I have two questions.
>
> QUESTION ONE: regarding normalized gain.
>  I've noted before that I can see the advantage of <g> or "normalized
> gain" as you have defined it for reporting meaningfully across many
> settings.  But in the context of a research study like the one completed
> by Mathematica, wouldn't the ideal approach be to get pretest data
> (as you have advocated) but then to use Analysis of Covariance
> (ANCOVA), instead of using "normalized gain" as a dependent
> variable?  (Note you argue that normalized gain "IS A MUCH BETTER
> INDICATOR OF THE EXTENT TO WHICH A TREATMENT IS EFFECTIVE THAN IS
> EITHER GAIN OR POSTTEST"--and I agree with you--but my impression is
> that ANCOVA is better still.  Do you have data I don't have that
> would indicate my impression is incorrect?)

Umm... You correctly describe <g> as a way of (re)defining one's
dependent variable, which otherwise would be either posttest or simple
gain (posttest minus pretest), presumably.  What precisely is it that
you expect ANCOVA to do for you?  So far as I can see, the only two
models on offer (in the usual ANCOVA context, excluding interaction
between pretest and treatment, which exclusion I consider unreasonable)
are

 I.   Y = b0 + b1*X + b2*T   or
 II.  G = c0 + c1*X + c2*T

where X = pretest, Y = posttest, G = Gain = (Y-X), and
 T = indicator variable for treatment.

Evidently c1 = b1 - 1, c2 = b2, and (substituting G = Y - X into I) c0 = b0.
The measure of effectiveness of treatment in either I or II is the
coefficient b2 (or c2:  same thing).  This is perhaps not unreasonable
if the treatment effect can be believed to be constant over the range
of the pretest.  But if the range of the pretest is large enough that
one would feel constrained to use ANCOVA in the first place, it is
almost surely large enough to deny any such constancy:  one expects
ceiling effects (in either Y or G) at high values of X for the upper
regression line (the treatment, presumably), and floor effects may not
be unexpected at low values of X for the lower line (although this point
may be disputed, depending in part on context).  It follows that one
cannot credibly assume the slopes of the two regression lines (one for
treatment, one for control) to be either equal or constant over the full
range of the pretest.
  By about this point one is reminded of George Box's dictum ("all models
are wrong, but some are useful"), and begins to suspect that this
(standard ANCOVA:  I or II above) is not one of the useful models.
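The algebraic identity among the coefficients of models I and II is easy to
check numerically.  Below is a minimal sketch with simulated data; the
particular coefficient values, sample size, and seed are my own illustrative
assumptions, not anything taken from the study under discussion:

```python
import numpy as np

# Simulated pretest/posttest data under models I and II above.
rng = np.random.default_rng(0)
n = 200
T = rng.integers(0, 2, n)                            # treatment indicator
X = rng.uniform(20.0, 80.0, n)                       # pretest
Y = 10.0 + 0.6 * X + 8.0 * T + rng.normal(0, 5, n)   # posttest
G = Y - X                                            # simple gain

# Design matrix [1, X, T]; fit model I (response Y) and model II (response G).
A = np.column_stack([np.ones(n), X, T])
b, *_ = np.linalg.lstsq(A, Y, rcond=None)            # b0, b1, b2
c, *_ = np.linalg.lstsq(A, G, rcond=None)            # c0, c1, c2

# Least squares is linear in the response, and X is itself a column of A,
# so c0 = b0, c1 = b1 - 1, c2 = b2 exactly (up to floating point).
print(b)
print(c)
```

In particular the two fits yield the identical treatment coefficient
(b2 = c2), so nothing is gained or lost by choosing G over Y as the
response in the standard ANCOVA.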

> QUESTION TWO:  regarding the Mathematica study and missing covariate
> data. I'd think, based on what your post described, that there would
> be enough data available in the Mathematica data-set to utilize
> Multiple Imputation for the missing covariate data, and then to run
> a legitimate quasi-experiment with the multiply imputed data,
> adjusting the confidence intervals using methods described by Rubin.
> Am I overly enamored of the (to me) new technique of Multiple
> Imputation, or am I correct that we now have the statistical tools
> needed to handle a problem like the one encountered by Mathematica?

Imputation (multiple or otherwise) can only "work" if the variables
available (from which to impute the missing information) include the
salient variables.  This necessarily involves a considerable leap of
faith.  I used to do a little demonstration to show why one should not
use the original, simplest, method of imputation (substituting the mean
value of the missing variable), because that method generated what one
might call "the cruciform fallacy".  Suppose you have two variables (X
and Y), with some missing data on each variable scattered throughout the
range of the non-missing variable.  If X and Y are somewhat correlated,
a scatterplot might look like the figure below, in which * displays an
observed datum (both X and Y observed), . displays the X (or Y) value
of a datum for which Y (or X) is missing, and + displays the imputed
datum point (substituting the mean of X (or of Y) for the missing value).

   Y   +
       -.                     +
       -.                     +               *
       -.                     +       *     *
       -                         *         *    *
       +.                   * +   *           *
       -.                  *  +  *     *    *
       -     +    +      *+  *  + *   +*     +   +
       -.              *      + *
       -            *   *   *
       +.     *  *     *      +
       -.     *    *          +
       -.  *   *              +
       -     .    .       .     .     .      .   .
         --+---------+---------+---------+---------+-- X


As you can see, while the means of X and Y may be more or less unchanged
by this imputation, their variances are reduced, and the correlation
between them is made to appear much weaker than it "really" is.
 (You can perhaps also see why I call it a "cruciform" fallacy.)

Imputation by regression is perhaps superior, but one suspects that its
defects are merely subtler and harder to illustrate.  (In the two-dimensional
case, the means would remain more or less unchanged, the variances would
still be reduced (but by rather less than in the case shown above), and
the correlation coefficient would be INcreased (because of all those
extra points now ON the regression line).)
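Both effects (variance shrinkage and correlation attenuation under mean
imputation; correlation inflation under regression imputation) can be
demonstrated numerically.  A minimal sketch on simulated data follows; the
distributions, missingness rate, and seed are my own illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500
X = rng.normal(50.0, 10.0, n)
Y = 0.8 * X + rng.normal(0.0, 6.0, n)            # Y somewhat correlated with X

miss = rng.random(n) < 0.3                       # Y missing at random, ~30%
r_cc = np.corrcoef(X[~miss], Y[~miss])[0, 1]     # complete-case correlation

# Mean imputation: substitute the observed mean of Y for each missing value.
Y_mean = Y.copy()
Y_mean[miss] = Y[~miss].mean()

# Regression imputation: predict each missing Y from X.
b1, b0 = np.polyfit(X[~miss], Y[~miss], 1)
Y_reg = Y.copy()
Y_reg[miss] = b0 + b1 * X[miss]

print(np.var(Y_mean), np.var(Y[~miss]))          # variance shrinks
print(np.corrcoef(X, Y_mean)[0, 1], r_cc)        # correlation attenuated
print(np.corrcoef(X, Y_reg)[0, 1], r_cc)         # correlation inflated
```

The mean-imputed correlation comes out well below the complete-case value,
while the regression-imputed correlation comes out above it, just as argued
in the text.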

As you may have inferred, I am not particularly enamored of methods of
multiple imputation.  The worst sticking point, I think, is the saliency
of the variables used for imputation, and one's natural tendency to
suppose (given semi-automatic methods of computation) that that's all
been taken care of.  For one simple example, suppose a pretest/posttest
situation where the posttest (say) is missing AND so is the value of T:
that is, whether the observation was in the treatment or control group.
Then you're guaranteed to get nonsense, since the imputation machinery
will have no choice but to estimate the posttest value in the middle of
the space between the treatment/control regression lines (that is,
assuming that there really is a treatment effect).
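That point can be made concrete: impute the missing posttest by regressing on
the pretest alone, omitting the (also missing) treatment indicator, and the
imputed value lands in the gap between the two group lines.  A minimal sketch
on simulated data (all numbers are my own illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 400
T = rng.integers(0, 2, n)                            # treatment indicator
X = rng.uniform(20.0, 80.0, n)                       # pretest
Y = 10.0 + 0.6 * X + 12.0 * T + rng.normal(0, 4, n)  # real treatment effect

# Regression imputation that has no choice but to omit the missing T:
b1, b0 = np.polyfit(X, Y, 1)

x0 = 50.0                                            # pretest of the missing case
imputed = b0 + b1 * x0
control_line = 10.0 + 0.6 * x0                       # true control expectation
treated_line = 10.0 + 0.6 * x0 + 12.0                # true treatment expectation

# The imputed posttest falls between the two lines -- nonsense for a case
# that in fact belonged to one group or the other.
print(control_line, imputed, treated_line)
```

With roughly half the cases in each group, the pooled regression line runs
midway between the group lines, so the imputed value sits about half the
treatment effect away from either truth.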

 -- Don Burrill.
 -----------------------------------------------------------------------
 Donald F. Burrill                                         [EMAIL PROTECTED]
 56 Sebbins Pond Drive, Bedford, NH 03110                 (603) 626-0816
