it seems to me that the notion of a confidence interval is a general
concept ... having to do with estimating some unknown quantity when
errors are known to occur in the estimation process
in general, the generic version of a CI is:
statistic/estimator +/- (multiplier) * error
the multiplier drives the level of confidence, and the error will be
estimated by different processes depending upon the parameter or quantity you
are estimating
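
to make that generic formula concrete, here is a minimal python sketch ... the
function name and the numbers plugged in are purely illustrative, not any
standard routine

def generic_interval(estimate, multiplier, error):
    # return (lower, upper) for estimate +/- multiplier * error
    half_width = multiplier * error
    return estimate - half_width, estimate + half_width

# e.g. an estimate of 3.10 with a 1.96 multiplier and an error of 0.25
print(generic_interval(3.10, 1.96, 0.25))   # roughly (2.61, 3.59)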
what we might want to estimate in a regression setting is what one
particular person might do on a future outcome variable, like college gpa,
given that we know what THAT person has achieved on some current variable
(high school gpa) ... if we are interested in this specific person, then
error will be estimated by some function of HIS/HER variation around the
predicted value, and that will be factored into the above generic equation as
error ... this is what jon cryer rightly called a prediction interval ...
BUT, it still fits within the realm of the concept of a CI
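
here is a minimal python sketch of that prediction interval for a single
person in simple regression ... the gpa numbers are made up purely for
illustration, and the error term assumed is the usual one that folds in the
individual's own scatter around the fitted line

import numpy as np
from scipy import stats

# made-up illustrative data: high school gpa (x) predicting college gpa (y)
x = np.array([2.0, 2.5, 3.0, 3.2, 3.5, 3.7, 3.9, 4.0])
y = np.array([1.9, 2.4, 2.8, 3.0, 3.1, 3.4, 3.6, 3.8])

n = len(x)
sxx = np.sum((x - x.mean()) ** 2)
b1 = np.sum((x - x.mean()) * (y - y.mean())) / sxx      # slope
b0 = y.mean() - b1 * x.mean()                           # intercept
resid = y - (b0 + b1 * x)
s = np.sqrt(np.sum(resid ** 2) / (n - 2))               # residual standard deviation

x0 = 3.4                                                # THIS person's high school gpa
yhat0 = b0 + b1 * x0                                    # predicted college gpa
# error for ONE person: includes that person's own scatter around the line (the "1 +")
se_pred = s * np.sqrt(1 + 1/n + (x0 - x.mean()) ** 2 / sxx)
t_mult = stats.t.ppf(0.975, df=n - 2)                   # multiplier for 95% confidence

print("95% prediction interval:", yhat0 - t_mult * se_pred, yhat0 + t_mult * se_pred)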
in other regression cases, we might not be interested in estimation for one
specific individual on the criterion given that individual's score on the
current variable, but rather in the expected MEAN criterion value for
a group of people who all got the same current variable value ... in this
case, error is estimated by some function of the uncertainty in that group's
MEAN criterion value ... and this is what in regression terms is called a
confidence band or interval ... but the concept itself is no different from
the prediction interval ... what IS different is what is considered error and
how we estimate it
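
continuing the same made-up gpa example ... this short sketch (using the
statsmodels library, just one possible route) reports both intervals side by
side, so you can see that only the error piece changes ... the interval for
the group MEAN is narrower than the one for a single person

import numpy as np
import statsmodels.api as sm

# same made-up gpa data as in the sketch above
x = np.array([2.0, 2.5, 3.0, 3.2, 3.5, 3.7, 3.9, 4.0])
y = np.array([1.9, 2.4, 2.8, 3.0, 3.1, 3.4, 3.6, 3.8])

X = np.column_stack([np.ones_like(x), x])       # intercept column plus predictor
fit = sm.OLS(y, X).fit()

X0 = np.array([[1.0, 3.4]])                     # same current variable value, x0 = 3.4
frame = fit.get_prediction(X0).summary_frame(alpha=0.05)
# mean_ci_* : interval for the MEAN criterion value of the group at x0 (narrower)
# obs_ci_*  : interval for ONE new individual at x0 (wider)
print(frame[["mean_ci_lower", "mean_ci_upper", "obs_ci_lower", "obs_ci_upper"]])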
when we use a sample mean to estimate some population mean, we have the
same general problem ... we use the sample mean as the estimator, and we have
a way of conceptualizing and estimating error (the standard error of the
mean) in that case ... BUT, we still use the generic formula above to build
our CI
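
a minimal sketch of that familiar case, again with made-up numbers:

import numpy as np
from scipy import stats

# made-up sample used to estimate some unknown population mean
sample = np.array([23.1, 25.4, 21.8, 24.0, 26.2, 22.5, 24.8, 23.9])
n = len(sample)
xbar = sample.mean()                              # the statistic/estimator
se = sample.std(ddof=1) / np.sqrt(n)              # the error piece (standard error of the mean)
t_mult = stats.t.ppf(0.975, df=n - 1)             # the multiplier piece, for 95% confidence

print("95% CI for the population mean:", xbar - t_mult * se, xbar + t_mult * se)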
in all of these cases, there is a concept of what error is and some method
by which we estimate it, and in all these cases we use some known quantity
(statistic/estimator) to take a stab at an unknown quantity (parameter/true
criterion) ... and we use the estimated error around the known quantity as
a fudge factor, a tolerance factor, a margin of error ... when making
our estimate of the unknown quantity of interest
all of these represent the same basic idea ... only the details of what is
used as the point estimate and what is used as the estimate of ERROR of the
point estimate ... change
also, in all of these cases, whether it be in regression work or in sampling
error (of means, for example) work ... we still attach a quantity ... a
percentage value ... to the intervals we have created when estimating
the unknown and, as far as i can tell, we interpret that percentage in the
same way in all of these cases ... with respect to the long run proportion
or percentage of "hits" that our intervals have in capturing
the true value (parameter or true criterion value)
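
one way to see that long run interpretation is a small simulation ... the
population values below are arbitrary and only there to illustrate the idea:

import numpy as np
from scipy import stats

# long-run "hit rate" of 95% intervals for a mean; population values are arbitrary
rng = np.random.default_rng(0)
true_mean, true_sd, n, reps = 50.0, 10.0, 25, 10_000
t_mult = stats.t.ppf(0.975, df=n - 1)

hits = 0
for _ in range(reps):
    sample = rng.normal(true_mean, true_sd, size=n)
    se = sample.std(ddof=1) / np.sqrt(n)
    lower = sample.mean() - t_mult * se
    upper = sample.mean() + t_mult * se
    hits += (lower <= true_mean <= upper)

print("proportion of intervals that captured the true mean:", hits / reps)   # close to 0.95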
i am more than willing to use different terms to differentiate among
these different settings ... such as in regression, when you are inferring
something about an individual ... or about a group of individuals (though even
here, i think we could select better differentiators than we currently use
... like personal interval versus group interval) ... but overall, all of
these are variations on the same fundamental idea
IMHO of course
_________________________________________________________
dennis roberts, educational psychology, penn state university
208 cedar, AC 8148632401, mailto:[EMAIL PROTECTED]
http://roberts.ed.psu.edu/users/droberts/drober~1.htm