Thanks for your reply - as usual, thorough, with lots to think about.
As far as I can make out, the address for the ANZStat list must have changed - my own email did not go through to it.
On Monday, November 18, 2002, at 02:32 PM, Donald Burrill wrote:
> [I suspect this reply will not be broadcast to ANZstat, as I am not a
> member of that list; you may (or not!) want to forward it, Alan.]
>
> On Mon, 18 Nov 2002, Alan McLean wrote:
>
>> I have a couple of questions, one of which has been bubbling round in
>> my mind for some years, the other is more recent. The recent one is
>> the following: The use of the t distribution in inference on the mean
>> is on the whole straightforward; my question relates to the theory
>> underlying this use. If Z = (X - mu)/sigma is ~ N(0, 1), then is
>> T = (X - mu)/s (where s is the sample SD based on a simple random
>> sample of size n) ~ t(n-1)?
>
> Short answer: Yes.
>
> Longer answer: the number of degrees of freedom for the t distribution
> for such a statistic is the number of degrees of freedom associated with
> the variance estimate (well, with its square root) in the denominator.

This is certainly the case. However, my uncertainty lies in that the sample on which the variance estimate s^2 is based is not in itself linked to the value of X. A related question is whether or not in the usual application T = (Xbar - mu)/[s/sqrt(n)] the sample SD used, s, has to refer to the same sample as Xbar?
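
One way to probe whether s has to come from the same sample as Xbar is a small simulation. The sketch below is an illustration only and not part of the original exchange: the function names (sample_sd, sample_variance, simulate_t, t_variance), the sample sizes, and the seed are assumptions chosen for the example, and the population is assumed to be normal. It compares the variance of the simulated T values with df/(df - 2), the variance of a t distribution with df > 2 degrees of freedom, where df is taken from whichever sample supplied s.

```python
import math
import random


def sample_sd(xs):
    """Sample standard deviation with the usual n - 1 divisor."""
    mean = sum(xs) / len(xs)
    return math.sqrt(sum((x - mean) ** 2 for x in xs) / (len(xs) - 1))


def sample_variance(xs):
    """Sample variance (square of sample_sd)."""
    return sample_sd(xs) ** 2


def simulate_t(n, m=None, reps=20000, mu=0.0, sigma=1.0, seed=1):
    """Simulate T = (Xbar - mu) / (s / sqrt(n)) for normal data.

    Xbar is the mean of n draws. If m is None, s is the SD of those same
    n draws; otherwise s is the SD of m further, independent draws from
    the same population (an assumption made for this illustration).
    """
    rng = random.Random(seed)
    ts = []
    for _ in range(reps):
        x = [rng.gauss(mu, sigma) for _ in range(n)]
        xbar = sum(x) / n
        if m is None:
            s = sample_sd(x)
        else:
            s = sample_sd([rng.gauss(mu, sigma) for _ in range(m)])
        ts.append((xbar - mu) / (s / math.sqrt(n)))
    return ts


def t_variance(df):
    """Variance of a t distribution with df > 2 degrees of freedom."""
    return df / (df - 2)


if __name__ == "__main__":
    same = simulate_t(n=10)          # s from the same sample: df = n - 1 = 9
    other = simulate_t(n=10, m=30)   # s from another sample:  df = m - 1 = 29
    print("same sample:       ", sample_variance(same), "vs", t_variance(9))
    print("independent sample:", sample_variance(other), "vs", t_variance(29))
```

If the standard theory holds, the second case should track t(m-1) rather than t(n-1): what matters is that s is independent of the numerator and carries its own degrees of freedom. For normal data that independence also holds when Xbar and s come from the same sample.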

>> My second question is on the matter of confidence intervals. <snip>
>> The expression P(Xbar - 1.96 x SE < mu < Xbar + 1.96 x SE) = 0.95 is
>> a perfectly good prediction interval - it expresses the probability
>> of getting a sample mean which satisfies this inequality. Now replace
>> the RV Xbar by the observed sample value to give the interval:
>> xbar - 1.96 x SE < mu < xbar + 1.96 x SE. This is of course the
>> confidence interval on the population mean mu.
>
> Minor quibble: Provided "SE" is the population standard error of the
> mean (and supposing that you're specifying a 95% C.I.), which is
> consistent with the notation you specified.

That was to be understood.
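
The 0.95 in the probability statement about Xbar can be checked by repeated sampling. The sketch below is only an illustration under stated assumptions: a normal population with known sigma (so that SE is the population standard error, as in the quibble above), and a hypothetical function name, parameter values, and seed. It counts how often the random interval Xbar +/- 1.96 x SE captures the fixed value mu.

```python
import math
import random


def coverage(n=25, mu=10.0, sigma=2.0, reps=20000, z=1.96, seed=2):
    """Fraction of simulated intervals xbar +/- z*SE that contain mu.

    SE = sigma / sqrt(n) is the population standard error; sigma is
    treated as known, which is an assumption of this illustration.
    """
    rng = random.Random(seed)
    se = sigma / math.sqrt(n)
    hits = 0
    for _ in range(reps):
        xbar = sum(rng.gauss(mu, sigma) for _ in range(n)) / n
        # Each replication yields one observed interval; mu is either
        # inside it or not.
        if xbar - z * se < mu < xbar + z * se:
            hits += 1
    return hits / reps


if __name__ == "__main__":
    print("observed coverage:", coverage())
```

The proportion should come out close to 0.95, but it describes the procedure over many samples; any single observed interval either contains mu or does not.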

>> Whatever is said in the text books, this is understood by most people
>> as a statement that "mu lies in the interval with probability 0.95" -
>> or something very close to this. In effect, we define a secondary
>> notional variable Y which imagines that we could find out the 'true'
>> value of mu; Y = 1 if this true value is in the confidence interval,
>> = 0 otherwise - and we estimate the probability that Y = 1 as 0.95.
>
> Interesting concept, I'll have to think about that "notional variable"
> a bit.
>
>> I have been teaching statistics for 30-odd years and have become more
>> and more disillusioned with the treatment of confidence intervals in
>> the text books!
>
> I think the earlier approaches that began with hypothesis testing before
> introducing the idea of confidence intervals were superior to what I've
> been encountering of late, where C.I.s are introduced first (often, one
> suspects, before many students have managed to internalize the idea of
> probability at all thoroughly), and hypothesis testing appears later.
>
>> So my question is: how do YOU explain to students what a confidence
>> interval REALLY is?
>
> A C.I. is an observed instance of a random variable, representing the
> range of values that one might specify in a null hypothesis and NOT have
> the hypothesis rejected.

How can this be? The acceptance region of the test is based on the hypothesised value, while the confidence interval is based on the observed value. This is I think another way of expressing the root of my uncertainty.

> Where an acceptance region (which I hope I'll
> have had occasion to explain previously!) is an interval centered on the
> value specified in a null hypothesis, and represents the range of
> possible values of the sample mean that would NOT lead to rejection of
> that hypothesis; a C.I. is an interval centered on the observed sample
> mean (which is why it is a value of a random variable: tomorrow, if you
> went out and looked again, you'd probably find a different value of the
> sample mean, hence a different C.I.), and represents (as above) the
> range of values of possible null hypotheses that could NOT be rejected
> under the conditions of the current experiment.
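
The correspondence between the acceptance region and the C.I. can be checked numerically. The sketch below is an illustration only: it assumes a two-sided z test at the 5% level with a known standard error, and the function names and the numbers (xbar = 50.3, SE = 1.7, the grid of candidate null values) are hypothetical. For each candidate mu0 it asks two questions separately: does the observed xbar fall in the acceptance region centered on mu0, and does mu0 fall in the C.I. centered on xbar?

```python
Z = 1.96  # two-sided 5% critical value for a z test (SE assumed known)


def acceptance_region(mu0, se, z=Z):
    """Interval of sample means that would NOT reject H0: mu = mu0."""
    return (mu0 - z * se, mu0 + z * se)


def confidence_interval(xbar, se, z=Z):
    """Interval centered on the observed sample mean."""
    return (xbar - z * se, xbar + z * se)


def not_rejected(xbar, mu0, se, z=Z):
    """True if xbar lies inside the acceptance region for mu0."""
    lo, hi = acceptance_region(mu0, se, z)
    return lo < xbar < hi


def in_interval(mu0, xbar, se, z=Z):
    """True if mu0 lies inside the C.I. built from xbar."""
    lo, hi = confidence_interval(xbar, se, z)
    return lo < mu0 < hi


if __name__ == "__main__":
    xbar, se = 50.3, 1.7                        # hypothetical observed values
    grid = [40 + 0.25 * k for k in range(81)]   # candidate null values 40..60
    agree = all(not_rejected(xbar, m0, se) == in_interval(m0, xbar, se)
                for m0 in grid)
    print("the two criteria agree on every candidate:", agree)
```

Both conditions reduce to |xbar - mu0| < 1.96 x SE, which is why the two intervals, although centered on different values, pick out the same pairs (xbar, mu0).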
>
> I usually added, in introducing a C.I. in the first place, that
> sometimes one has an obvious null hypothesis (that is, an obvious
> null-hypothetical value of a parameter) to test, and in that
> circumstance an hypothesis test is clearly appropriate. But sometimes
> there isn't an obvious value to specify for mu (or sigma, or rho, or
> beta, or ...), and then one might be interested in knowing what
> (potential) values of mu (or whatever) would be consistent with the
> data in hand.
>
> Is this any help? -- Don.

It certainly stimulates the aging brain cells....

Regards,
Alan

-----------------------------------------------------------------------
Donald F. Burrill [EMAIL PROTECTED]
56 Sebbins Pond Drive, Bedford, NH 03110 (603) 626-0816
[was: 184 Nashua Road, Bedford, NH 03110 (603) 471-7128]
=================================================================
Instructions for joining and leaving this list, remarks about the
problem of INAPPROPRIATE MESSAGES, and archives are available at:
                  http://jse.stat.ncsu.edu/
=================================================================
