"Robert J. MacG. Dawson" <[EMAIL PROTECTED]> wrote:
> dennis roberts wrote:
>
>> At 12:56 PM 1/17/01 -0400, Robert J. MacG. Dawson wrote:
>>
>>> The testing example is not a stationary process,
>>
>> well, does this mean that NO testing example when there is a less than
>> perfect r between the two sets of "test" measures ... would qualify for
>> being a context in which to illustrate RTM?
>
> Not at all. If there's a perfect r you _won't_ see regression to the
> mean! What it means is that not everything which expands or compresses
> the ends of a distribution is RTM.
My understanding of "regression to the mean" is based mostly on a chapter
in Stigler's history, and on sections from Freedman, Pisani, and Purves,
the latter being perhaps more accessible.
Stigler points out that Galton himself didn't seem (initially) to
understand all the implications of his method, in particular that the
regression effect is present in all of his models, so it's not hard to
understand why the effect is still prone to confusion.
Galton's original observation was based on his study of the heights of
sons and their fathers (though, as Dennis has pointed out, the data he
used were not quite that simple). The standard deviations of sons'
heights and fathers' heights are *approximately* equal (thank goodness,
or else after ten thousand generations of human evolution we'd either
all be exactly the same height, or some of us would be vanishingly short
while others of us would rival the redwoods; in fact, Galton did
something even more confusing, which was to average the heights of sons
so that their SD was slightly smaller). That near-equality made it easy
for the aristocratic Galton to observe what he thought was "regression
to mediocrity": *on average*, the progeny of "short" fathers were closer
to the mean height than their fathers were (note that the sons were
still shorter than average; they were simply closer to average than
their fathers), and *on average* the progeny of "tall" fathers were
shorter than their fathers. Apparently,
Galton initially thought that this was evidence of some "natural" law,
akin to the natural laws discovered by his more famous cousin, Darwin.
Smart family. At some later time, Galton realized that: 1) there was no
"natural" law; 2) this characteristic was common to all analyses based on
his new method; and 3) the term "regression to mediocrity" was a bit, uh,
charged, so he substituted the more neutral-sounding "regression to the
mean."
In fact, the regression effect is a characteristic of Galton's regression
method. A more precise statement of the regression effect would be: "the
predicted value of the dependent variable (y hat) is closer to its mean
(y bar), in standard deviation units, than x is to x bar in standard
deviation units." Another way to see this is that in the bivariate
regression case,
the slope of the regression line, b, is related to the correlation
coefficient r(x,y) and the two standard deviations, sd(x) and sd(y) by
b = r(x,y) * sd(y)/sd(x)
Since r is bounded by -1 and 1, the regression line is *always* less
steep than a line with slope sd(y)/sd(x), which, when it passes through
(x bar, y bar), is also known as the principal components line (in the
bivariate case, there's only one). Points along that line have the
characteristic
that a 1 sd change in x is associated with a 1 sd change in y, so
-> any point on the PC line is the same distance from y bar in
y standard deviation units as it is from x bar in x sd units.
Since the regression line is *always* less steep than the PC line,
-> any point on the regression line has the characteristic that
its y component is closer to y bar (in y sd units) than x is
to x bar (in x sd units)
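That identity is easy to check numerically. Here's a quick sketch (in
Python; my own illustration, not part of the original discussion, with
made-up simulated data) verifying that the least-squares slope computed
directly equals r(x,y) * sd(y)/sd(x):

```python
import random
import statistics as st

random.seed(1)
# Simulate bivariate data with an imperfect correlation:
# y = 0.5*x + noise, so |r| < 1.
x = [random.gauss(0, 1) for _ in range(10000)]
y = [0.5 * xi + random.gauss(0, 1) for xi in x]

n = len(x)
xbar, ybar = st.mean(x), st.mean(y)
sdx, sdy = st.stdev(x), st.stdev(y)

# Pearson correlation r(x, y), using the sample (n-1) convention
sxy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y))
r = sxy / ((n - 1) * sdx * sdy)

# Least-squares slope computed directly from the data ...
b_direct = sxy / sum((xi - xbar) ** 2 for xi in x)

# ... agrees with b = r * sd(y)/sd(x)
b_identity = r * sdy / sdx
print(b_direct, b_identity)
```

Since |r| < 1 here, the fitted slope is strictly less steep than
sd(y)/sd(x), which is the point of the argument above.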
This is what is now meant (at least, by me) by "regression toward the
mean." We mean that y hat "appears to regress toward the mean of y
relative to x."
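Put another way: in standard units the prediction is just r times x's
standard score, so it can never be farther from its mean than x is from
its own. A small self-contained sketch (mine, not from the original
post; the father/son numbers are hypothetical, chosen only to echo
Galton's setting):

```python
import random
import statistics as st

random.seed(2)
# Hypothetical father/son heights (inches) with r < 1 and
# roughly equal SDs, loosely in the spirit of Galton's data.
fathers = [random.gauss(68, 2.7) for _ in range(20000)]
sons = [0.5 * (f - 68) + 68 + random.gauss(0, 2.3) for f in fathers]

fbar, sbar = st.mean(fathers), st.mean(sons)
sdf, sds = st.stdev(fathers), st.stdev(sons)
r = sum((f - fbar) * (s - sbar) for f, s in zip(fathers, sons)) / \
    ((len(fathers) - 1) * sdf * sds)

def predict(f):
    """Regression prediction: in standard units, y hat is r times x's z-score."""
    return sbar + r * sds * (f - fbar) / sdf

# A father 2 SDs above the mean: his son's *predicted* height is only
# r * 2 SDs above the mean -- closer to average, i.e. regression to the mean.
tall_father = fbar + 2 * sdf
z_son = (predict(tall_father) - sbar) / sds
print(z_son)  # equals 2 * r, which is less than 2 since r < 1
```

Note that the predicted son is still taller than average (z_son > 0);
he is simply less extreme than his father, which is exactly the
"closer to the mean in sd units" statement above.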
The "RTM effect" is discussed most often in test-retest situations
because that's where we notice it most, and we notice it there because
the standard deviations of the test and the retest are approximately
equal. But it's still
there, which is what Elliot Cramer was saying. If we regressed, say,
income on education, we'd observe the same effect, viz., that our
predicted income for a given level of education is closer to the mean
income (in standardized units) than the level of education is to its mean
(in standardized units). This appears to be especially true for those of
us with PhD's (budda bing!).
--Robert Chung
=================================================================
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
http://jse.stat.ncsu.edu/
=================================================================