Re: differences between groups/treatments ?

2000-06-22 Thread Gene Gallagher

Rich Ulrich wrote:
 These are not quite equivalent options since the first one really
 stinks -- If you are considering drawing conclusions about causation,
 you need *random assignment* and the two Groups of performance are the
 furthest thing from random.

 Let's see:  the simple notion of regression-to-the-mean  says that the
 Best performers should fall back, the Worst performers should improve;
 that's a weird main-effect, which should wreak havoc with interpreting
 other effects.
 Or:  If the Pre is powerful enough to measure potential, then a
 continued-growth model says that Best performers should improve more,
 even given no treatments.

This pattern was described in an obituary in the NY Times two or three
years ago.  The obituary noted that the statistician had found a flaw in
the Israeli air force's training program.  Apparently, the Israeli air
force punished the worst performers on a test because this usually
produced better performance on subsequent tests and was supposedly much
more effective than positive reinforcement; they had found that positive
reinforcement of the best performers often resulted in poorer
performance on the next test.  The statistician pointed out the
confounding effect of regression to the mean on this assessment of
negative and positive reinforcement:  the apparent effectiveness of
negative reinforcement (punishment) could be nothing more than a chance
effect.
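A minimal simulation sketch of the effect (mine, not from the study;
it assumes each test score is a stable skill plus independent noise)
shows that pure chance produces exactly this pattern:

import numpy as np

rng = np.random.default_rng(0)
n = 10_000
skill = rng.normal(0, 1, n)           # stable underlying ability
test1 = skill + rng.normal(0, 1, n)   # first test: skill plus independent noise
test2 = skill + rng.normal(0, 1, n)   # retest: same skill, fresh noise, NO treatment

worst = test1 < np.quantile(test1, 0.10)   # the "punished" bottom decile
best  = test1 > np.quantile(test1, 0.90)   # the "praised" top decile

# With no intervention at all, the extremes regress toward the mean:
print("mean change, worst 10%:", (test2[worst] - test1[worst]).mean())  # positive
print("mean change, best 10%: ", (test2[best] - test1[best]).mean())    # negative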

I wish I had the citation for the study or the obit.

Does anyone else in the group have a citation of this study?

--
Eugene D. Gallagher
ECOS, UMASS/Boston







Re: differences between groups/treatments ?

2000-06-22 Thread Rich Strauss

At 04:31 PM 6/22/00 +, Gene Gallagher wrote:
This pattern was described in an obituary in the NY Times two or three
years ago.  [snip]  The apparent effectiveness of negative reinforcement
(punishment) could be nothing more than a chance effect.

A few years ago the journal "Statistical Methods in Medical Research"
published an issue on regression to the mean (vol. 6, no. 2, 1997).  It
included the following five papers:

Stigler, S. M.  Regression towards the mean, historically considered
  (pp. 103-114).

Chuang-Stein, C., & Tong, D. M.  The impact and implication of
  regression to the mean on the design and analysis of medical
  investigations (pp. 115-128).

Lin, H., & Hughes, M.  Adjusting for regression toward the mean when
  variables are normally distributed (pp. 129-146).

Chesher, A.  Non-normal variation and regression to the mean
  (pp. 147-166).

Copas, J.  Using regression models for prediction: shrinkage and
  regression to the mean (pp. 167-183).

Rich Strauss





Dr Richard E Strauss
Biological Sciences  
Texas Tech University   
Lubbock TX 79409-3131

Email: [EMAIL PROTECTED]
Phone: 806-742-2719
Fax: 806-742-2963 






Re: differences between groups/treatments ?

2000-06-22 Thread dennis roberts

regression to the mean does not necessarily apply when looking at
pretest scores ... and then gain or improvement ...

if we had parallel tests ... one for pre and one for post ... when nothing
happens in between ... then maybe so ...

please see a short summary of this scenario ... applied to grading based on
gain or improvement ... at

http://roberts.ed.psu.edu/users/droberts/5501.htm

don burrill and i wrote a short paper on this ... 

those high on the pre CAN gain MORE than those low on the pre IF ... 

the correlation between pre and post is decent AND, most importantly ... if
the variance on the POST is LARGER than the variance on the pre

the exact reference to the paper is ...

Roberts and Burrill (Spring 1995).  Gain score grading revisited.
Educational Measurement: Issues and Practice, 14(1).
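
a quick simulation sketch of the point above (illustrative only; the
numbers are made up):  with a decent pre-post correlation and a larger
post variance, the slope of post on pre, r*(sd_post/sd_pre), exceeds 1,
so expected gain actually INCREASES with the pre score ...

import numpy as np

rng = np.random.default_rng(1)
n = 10_000
r = 0.8                       # decent pre-post correlation
sd_pre, sd_post = 10.0, 20.0  # post variance larger than pre variance

pre = rng.normal(50, sd_pre, n)
# construct post with correlation r with pre and standard deviation sd_post
post = 50 + (r * sd_post / sd_pre) * (pre - 50) \
       + rng.normal(0, sd_post * np.sqrt(1 - r**2), n)

gain = post - pre
high = pre > np.median(pre)
print("mean gain, high-pre half:", gain[high].mean())   # larger
print("mean gain, low-pre half: ", gain[~high].mean())  # smaller
# slope of gain on pre = r*sd_post/sd_pre - 1 = 0.8*2 - 1 = +0.6 > 0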



At 04:31 PM 6/22/00 +, you wrote:
This pattern was described in an obituary in the NY Times two or three
years ago.  [snip]  The apparent effectiveness of negative reinforcement
(punishment) could be nothing more than a chance effect.

Does anyone else in the group have a citation of this study?

--
Eugene D. Gallagher
ECOS, UMASS/Boston



==
dennis roberts, penn state university
educational psychology, 814-863-2401
http://roberts.ed.psu.edu/users/droberts/droberts.htm





Re: differences between groups/treatments ?

2000-06-22 Thread lthayer

Look up the topic "regression to the mean":  when a quantity is measured
several times, extreme values tend to be followed on remeasurement by
values closer to the typical (mean) value.
In article 8itf0t$a68$[EMAIL PROTECTED],
  Gene Gallagher [EMAIL PROTECTED] wrote:
 This pattern was described in an obituary in the NY Times two or three
 years ago.  [snip]  Does anyone else in the group have a citation of
 this study?





Re: differences between groups/treatments ?

2000-06-20 Thread Rich Ulrich



On 19 Jun 2000 18:01:28 -0700, [EMAIL PROTECTED] (Dónal Murtagh) wrote:

  ...  
 Firstly, thank you for your comments. Am I right in saying that the two
 (equivalent) options I have are:

These are not quite equivalent options since the first one really
stinks -- If you are considering drawing conclusions about causation,
you need *random assignment* and the two Groups of performance are the
furthest thing from random.

Let's see:  the simple notion of regression-to-the-mean  says that the
Best performers should fall back, the Worst performers should improve;
that's a weird main-effect, which should wreak havoc with interpreting
other effects.
Or:  If the Pre is powerful enough to measure potential, then a
continued-growth model says that Best performers should improve more,
even given no treatments.  

For simple change-scores (and ANOVA interactions) from dichotomous
groups, you assume that neither of those possibilities is true, if
you want to be able to interpret them.

The Regression model at least places the contrasts into the realm 
of comparing the regression lines.  Your fundamental knowledge 
of what is happening will probably come from examining and comparing
the scatterplots, pre-post, for the two treatments.  (Another thing to
note from the picture:  Are there ceiling/basement effects on the
performance test?)
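
A minimal sketch of that regression-line comparison (hypothetical data;
pandas and statsmodels assumed available):  fit post on pre separately
by treatment via an interaction term, and read the equal-slopes question
off the pre:C(treat) coefficient.

import numpy as np, pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(2)
n = 120
pre = rng.normal(60, 12, n)
treat = rng.integers(0, 2, n)          # hypothetical 0/1 treatment indicator
post = 5 + 0.9 * pre + 6 * treat \
       + 0.1 * treat * (pre - 60) + rng.normal(0, 5, n)
d = pd.DataFrame(dict(pre=pre, treat=treat, post=post))

# One regression line per treatment; the pre:C(treat) row asks whether
# the two lines share a slope.  (Plot d, colored by treat, to inspect
# the scatter and any ceiling/basement flattening.)
fit = smf.ols("post ~ pre * C(treat)", data=d).fit()
print(fit.summary())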

 1.  ANOVA
 
 Yijk = mu + Ai + Bj + ABij + Eijk
 
 Ai:   a fixed factor representing the treatments (2 levels)
 Bj:   a fixed factor representing prior performance (2 levels)
 ABij: an interaction between Ai and Bj
 Yijk: the score of the kth child who received treatment i and is from group j
 Eijk: random error
 
 I suspect that this model is inappropriate, as the Eijk term will represent
 between-subjects (children) variation, which is not usually included in the
 estimate of random error.
 
 2.  MLR
 
 Y = B0 + B1*X1 + B2*X2 + B3*X3 + E
 
 X1:   prior performance (0 = weak, 1 = strong)
 X2:   treatment (0 = treatment A, 1 = treatment B)
 X3:   treatment*prior performance
 
 I appreciate that prior performance is probably better considered as a
 continuum, rather than a dichotomy.
 
 

 - Treating it as a continuum is better by a lot, even if you were sure
that the Performance scale was close to the ANOVA-analytic ideal, a
normal distribution.

-- 
Rich Ulrich, [EMAIL PROTECTED]
http://www.pitt.edu/~wpilib/index.html





Re: differences between groups/treatments ?

2000-06-20 Thread Donald Burrill

On Tue, 20 Jun 2000, Murtagh wrote:

 Firstly, thank you for your comments. Am I right in saying that the two
 (equivalent) options I have are:
 
 1.  ANOVA
 
 Yijk = mu + Ai + Bj + ABij + Eijk
 
 Ai:   a fixed factor representing the treatments (2 levels)
 Bj:   a fixed factor representing prior performance (2 levels)
 ABij: an interaction between Ai and Bj
 Yijk: the score of the kth child who received treatment i and is from 
   group j 
 Eijk: random error
 
 I suspect that this model is inappropriate, as the Eijk term will represent
 between-subjects (children) variation, which is not usually included in the
 estimate of random error.

I do not understand this comment.  What source(s) of random error exist 
in this design APART from variation between subjects within cells?  
Between-subjects variation (as residuals from the model) defines the 
standard error-variance term against which the variability in the 
systematic effects is tested.
 
 2.  MLR
 
 Y = B0 + B1*X1 + B2*X2 + B3*X3 + E
 
 X1:   prior performance (0 = weak, 1 = strong)
 X2:   treatment (0 = treatment A, 1 = treatment B)
 X3:   treatment*prior performance
-- hence X3 = X1*X2, so with the coding shown for X1 and X2, X3 = 1 for 
strong prior performance under treatment B, and 0 for all other 
conditions.

And  E = Eijk of the ANOVA model.  B1 is a straightforward function 
(depending on the coding of X1, of course) of the Bj in the ANOVA model, 
B2 of the Ai (and depends on the coding of X2), and B3 of Ai, Bj, and 
ABij. 
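
That correspondence is easy to verify numerically.  A small sketch
(fabricated data; pandas and statsmodels assumed available):  the
two-way ANOVA parameterization and the 0/1-dummy regression are the
same model, with identical fitted values.

import numpy as np, pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(3)
n = 25                                   # children per cell (hypothetical)
d = pd.DataFrame({
    "x1": np.tile([0, 1], 2 * n),        # prior performance: 0 = weak, 1 = strong
    "x2": np.repeat([0, 1], 2 * n),      # treatment: 0 = A, 1 = B
})
d["y"] = 10 + 4*d.x1 + 2*d.x2 + 3*d.x1*d.x2 + rng.normal(0, 2, len(d))

anova = smf.ols("y ~ C(x1) * C(x2)", data=d).fit()     # ANOVA parameterization
mlr   = smf.ols("y ~ x1 + x2 + x1:x2", data=d).fit()   # Donal's MLR with dummies

# Same model, same fit: B1 recovers the prior-performance (Bj) contrast,
# B2 the treatment (Ai) contrast, B3 the interaction ABij.
print(np.allclose(anova.fittedvalues, mlr.fittedvalues))   # True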
 
 I appreciate that prior performance is probably better considered as a
 continuum, rather than a dichotomy.

_I_ would consider it so.  In fact, the first thing I'd do is ask for 
scatterplots of post-performance vs. pre-performance for all the cells 
in the design I was considering.  (In what you've described, that's two 
cells.)  THEN decide whether it appeared to make better sense to divide 
the continuum into two (or more) pieces, or to model it AS a continuum, 
possibly with non-linear functions.

  1.  If there are children of different sexes, you may be able to 
  consider a three-way design, although I suspect it would be 
  unbalanced, which (I also suspect!) may induce serious difficulties 
  for you.

 You mean that there would not be the same numbers in each group? 

Yes.

 I can't see why this should cause problems, but then that's probably 
 due to my relative ignorance of linear models!

Doesn't cause problems in one-way designs.  But in 2-way designs (let 
alone 3-way, 4-way, ...) unequal  n's  induce association of some kind 
between the design factors.  People who do multiple regression don't have 
much problem with this (it's their normal situation);  but people who try 
to do formal ANOVA design-of-experiments (and are therefore accustomed to 
the notion that the factors are mutually independent (and therefore are 
orthogonal)) are sometimes boggled by (1) the fact that the sums of 
squares for the several sources of variation do not simply add to the 
total sum of squares about the grand mean, or (2) the fact that the 
sums of squares reported depend on the order in which the factors are 
considered.  And many of the standard packages for doing multi-factor 
ANOVA use algorithms that require the design to be balanced. 
 (A GLM -- general linear model -- program does not usually have such 
constraints, and may even produce output patterned after the form of a 
standard balanced ANOVA, but one needs to be aware of (1) and (2) above.) 
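
A small sketch of fact (2) (made-up unbalanced data; pandas and
statsmodels assumed available):  with unequal cell sizes, the sequential
(Type I) sums of squares change when the factors are entered in the
other order.

import numpy as np, pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

rng = np.random.default_rng(4)
rows = []
# deliberately unbalanced cell sizes
for treat, perf, m in [("A", "weak", 30), ("A", "strong", 10),
                       ("B", "weak", 10), ("B", "strong", 30)]:
    y = 10 + 5*(perf == "strong") + 2*(treat == "B") + rng.normal(0, 3, m)
    rows += [(treat, perf, v) for v in y]
d = pd.DataFrame(rows, columns=["treat", "perf", "y"])

# Sequential (Type I) sums of squares depend on the order of entry:
print(anova_lm(smf.ols("y ~ C(treat) + C(perf)", data=d).fit()))
print(anova_lm(smf.ols("y ~ C(perf) + C(treat)", data=d).fit()))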

  2.  Your Performance information you have chosen to dichotomize,
  although it is presumably (quasi-)continuous to start with.  You 
  might find out something useful by treating it as a continuous 
  predictor rather than as a dichotomy:  in effect carrying out an 
  analysis of covariance with pre-treatment reading score as the 
  covariate, whether you used an "Analysis of Covariance" program or 
  a "Multiple Regression" program or a "General Linear Model" (GLM) 
  program to do the arithmetic. 
 
 Presumably, this could be achieved by simply using the pre-treatment score 
 itself (rather than 0 or 1) for the value of X1 in the suggested MLR 
 model above?
Right. 
 And if the pre-post relationship should turn out to be detectably 
nonlinear, you can substitute some candidate nonlinear function(s) of X1 
and see if that helps.
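
A sketch of that check (invented data with built-in curvature; pandas
and statsmodels assumed available):  add a quadratic term in the pre
score and test whether it improves on the straight-line ANCOVA.

import numpy as np, pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(5)
n = 200
pre = rng.uniform(20, 95, n)
treat = rng.integers(0, 2, n)
# hypothetical truth with curvature: gains flatten as pre grows
post = 30 + 1.6*pre - 0.006*pre**2 + 4*treat + rng.normal(0, 4, n)
d = pd.DataFrame(dict(pre=pre, treat=treat, post=post))

linear = smf.ols("post ~ pre + treat", data=d).fit()
curved = smf.ols("post ~ pre + I(pre**2) + treat", data=d).fit()

# F-test of the quadratic term: does the candidate nonlinear function help?
print(curved.compare_f_test(linear))   # (F, p-value, df_diff)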

There may be nonlinearity to be EXPECTED:  in the nature of a reading 
test, there is a highest possible score (all items right, e.g.) and a 
lowest possible score (no items right, e.g.).  Students who perform well 
pre-treatment cannot have change scores that would put them above the 
highest possible score at post-treatment;  so it would not be surprising 
if (a) change correlates negatively with pre-treatment, (b) post scores 
were censored at the maximum (and negatively skewed), (c) pre scores were 
censored at the minimum (and positively skewed), and/or (d) the post vs. 
pre scatterplot showed some curvature, particularly near the ends of 
the scale.
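
A minimal simulation of that censoring story (my own sketch, with
made-up score limits):  clip a skill-plus-noise score at 0 and 100 and
the predicted signs appear.

import numpy as np

rng = np.random.default_rng(6)
n = 10_000
skill = rng.normal(70, 15, n)
pre  = np.clip(skill + rng.normal(0, 10, n), 0, 100)       # censored test scores
post = np.clip(skill + 5 + rng.normal(0, 10, n), 0, 100)   # small true gain, same ceiling

change = post - pre
print("corr(change, pre):", np.corrcoef(change, pre)[0, 1])   # (a) negative
print("share of post at ceiling:", (post == 100).mean())      # (b) censored at max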

Re: differences between groups/treatments ?

2000-06-20 Thread Donald Burrill

On Tue, 20 Jun 2000, Rich Ulrich wrote:

 On 19 Jun 2000 18:01:28 -0700, [EMAIL PROTECTED] (Dónal Murtagh) wrote:
 
   ...  
  Firstly, thank you for your comments. Am I right in saying that the two
  (equivalent) options I have are:
 
 These are not quite equivalent options since the first one really
 stinks --

Sorry, Rich, I must take issue with you.  If the first option really 
stinks, so does the second:  they are, in fact, equivalent, as Donal 
describes the second (with dichotomies for X1 and X2).

 If you are considering drawing conclusions about causation,

This is a fair enough warning, I suppose;  but I don't recall reading 
anything in the original post that implied that it was desired to show 
causation.  (Can't think of anything that expressly denied it either; 
but I still think you're reading it into, rather than out of, the 
problem.) 

 you need *random assignment* and the two Groups of performance are the
 furthest thing from random.

For that matter, had it been specified that the treatments were assigned 
at random?  In any case, I'd be interested in knowing how you would 
propose that performance might be assigned at random.  ;-)

 Let's see:  the simple notion of regression-to-the-mean  says that the 
 Best performers should fall back, the Worst performers should improve;
 that's a weird main-effect, which should wreak havoc with interpreting 
 other effects. 
 Or:  If the Pre is powerful enough to measure potential, then a
 continued-growth model says that Best performers should improve more,
 even given no treatments.  

Ummm...  I think you have to postulate that the POST is powerful enough, 
unless you're assuming that the Pre and Post measures are identical 
(which they may be, of course; though that introduces other measurement 
issues).

 For simple change-scores (and ANOVA interactions) from dichotomous
 groups, you assume that neither of those possibilities are true, if
 you want to be able to interpret them.

Only if you want to be able to interpret them SIMPLY.

 The Regression model at least places the contrasts into the realm 
 of comparing the regression lines. 
Yes, provided one is modelling 
the pretest as a continuum rather than as a coded dichotomy, as Donal 
described it.

 Your fundamental knowledge of what is happening will probably come 
 from examining and comparing the scatterplots, pre-post, for the two 
 treatments.  (Another thing to note from the picture:  Are there 
 ceiling/basement effects on the performance test?)

Good advice.  I concur.

  - Treating it as a continuum is better by a lot, even if you were
 sure that the Performance scale was close to the ANOVA-analytic
 ideal, a normal distribution.

Did you mean the ERRORS (or residuals) in the Performance scale, perhaps?
-- Don.
 
 Donald F. Burrill [EMAIL PROTECTED]
 348 Hyde Hall, Plymouth State College,  [EMAIL PROTECTED]
 MSC #29, Plymouth, NH 03264 603-535-2597
 184 Nashua Road, Bedford, NH 03110  603-471-7128  






Re: differences between groups/treatments ?

2000-06-19 Thread Donald Burrill

On Mon, 19 Jun 2000, Donal wrote:

 I'm currently analysing data resulting from a study of children's 
 reading ability. 

I shall resist the temptation to quibble over your inability to observe 
reading ability (as distinct from some indeterminate lower bound on that 
ability) ...

As you describe the study, you have an unspecified number of children 
divided into four groups in a two-way design of Treatments (2 levels) 
by Prior Performance (2 levels).  This would naturally lend itself to 
a two-way analysis of variance, or equivalently (pace Joe Ward) to a 
multiple regression analysis with three predictors:  Treatment, 
Performance, and Treatment*Performance.  If there are indeed effects 
attributable to Treatment and Performance, this analysis will be more 
sensitive to them than the two separate t-tests you propose.  And if 
there is an interaction between Treatment and Performance, as there may 
well be, the sensitivity to possible effects increases.
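
A compact sketch of the suggested analysis (hypothetical data; pandas
and statsmodels assumed available):  one two-way ANOVA with interaction,
in place of the two separate t-tests.

import numpy as np, pandas as pd
import statsmodels.formula.api as smf
from statsmodels.stats.anova import anova_lm

rng = np.random.default_rng(7)
n = 40                                    # children per cell (hypothetical)
d = pd.DataFrame({
    "treat": np.repeat(["A", "B"], 2 * n),
    "perf":  np.tile(np.repeat(["weak", "strong"], n), 2),
})
d["change"] = (10 + 5*(d.perf == "strong") + 3*(d.treat == "B")
               + rng.normal(0, 5, len(d)))

# Both factors and their interaction in one model; the error term is
# between-children variation within cells, so this is more sensitive
# than two separate t-tests on "change".
print(anova_lm(smf.ols("change ~ C(treat) * C(perf)", data=d).fit(), typ=2))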

Whether this is the best analysis available is another question entirely. 

1.  If there are children of different sexes, you may be able to 
consider a three-way design, although I suspect it would be unbalanced, 
which (I also suspect!) may induce serious difficulties for you.

2.  Your Performance information you have chosen to dichotomize, 
although it is presumably (quasi-)continuous to start with.  You might 
find out something useful by treating it as a continuous predictor 
rather than as a dichotomy:  in effect carrying out an analysis of 
covariance with pre-treatment reading score as the covariate, whether you 
used an "Analysis of Covariance" program or a "Multiple Regression" 
program or a "General Linear Model" (GLM) program to do the arithmetic.

3.  In addition to sex, there may be other lurking variables in your data 
that could be used as predictors.  Whether it is sensible to consider 
including them in a hypothetical model depends partly on how many 
children you have all together, and partly on the distribution of any 
such candidate variable among _these_ children.

 The study involves two treatments and each child's reading ability was 
 measured before and after the application of one of the treatments.
 Thus, each child received one or the other (but not both) of two 
 possible treatments.  The children are divided into two groups:

Well, that's not quite true.  You chose to categorize them into two 
groups, but they could equally well have been divided into three, or 
four, or six (depending on the number of children available and one's 
degree of interest in fine-tuning the "Weak/Strong" dimension).
And if you have both boys and girls, you have two sexes as well, and 
it would not be surprising if they differed in their responses to the 
two treatments.  And how about the ages of the children?

 Weak readers: those whose pre-treatment reading score was less than 
 the mean pre-treatment reading score
 Strong readers: those whose pre-treatment reading score was greater 
 than the mean pre-treatment reading score

It is more usual, in situations like this, to divide at the median 
rather than the mean.  (For one thing, you're more likely to end up 
with groups of at least approximately equal size.)  Did you have a 
reason for using the mean?  Where did you put persons whose score 
was equal to the mean?

 Anyhow, I would like to test (for each treatment) whether or not the 
 change in reading score (Post-treatment score - Pre-treatment score) 
 is the same for weak readers and strong readers. I have attempted to 
 test this by:

 1. Creating a new variable, "Change"
  Change = Post-treatment score - Pre-treatment score
 
 2. Using a two-sample t-test to determine whether or not the mean 
 value of "Change" measured over the weak readers is significantly 
 different from the mean value of "Change" measured over the strong 
 readers.
 
 Similarly, I'd like to test whether or not the change in the reading 
 score is the same for each treatment. I have attempted to test this by:
 
 1. Creating a new variable, "Change"[as above]
 
 2. Using a two-sample t-test to determine whether or not the mean value 
 of "Change" measured over treatment A is significantly different from 
 the mean value of "Change" measured over treatment B

 However, I am not certain that this is the best way to test my 
 hypothesis, if anyone can suggest a better way, I'd be very grateful 
 for their assistance.

Do these in fact represent your hypotheses, or were they just the 
closest you thought you could get to what you really wanted to find out? 
E.g., are you REALLY only interested in the change scores, or are you 
(perhaps ALSO) interested in the level of proficiency attained, as 
measured (however imperfectly) by your post-test reading scores?
-- DFB.
 
 Donald F. Burrill [EMAIL PROTECTED]
 348 Hyde Hall, Plymouth State College,  [EMAIL PROTECTED]
 MSC #29, Plymouth, NH 03264 603-535-2597
 184 Nashua Road, Bedford, NH 03110  603-471-7128