In article <[EMAIL PROTECTED]>, Stan Brown
<[EMAIL PROTECTED]> writes
>I think I've got some sort of mental block on the following point.
>Can someone explain this to me, plainly and simply, please?
>
>Let me start with a sample problem, NOT created by me:
>
>[The student is led to enter two sets of unpaired figures into
>Excel. They represent miles per gallon with gasoline A and gasoline
>B.
...
>The question is whether there is a difference in gasoline mileage.
>
>The student is led to a two-sample F test for homoscedasticity;
>p=0.1886 so the samples are treated as homoscedastic. Now the
>problem says: ]
>
>"Now the main t-test ... Two Sample Assuming Equal Variances. ...
>Use two-tail results (since '=/=' in Ha). ... What is the P-val for
>the t-test?" [Answer: p=.0002885]
>
>"What's your conclusion about the difference in gas mileage?"
>[Answer: At significance level 5%, previously selected, there is a
>difference between them.]
>
>Now we come to the part I'm having conceptual trouble with: "Have
>you proven that one gas gives better mileage than the other? If so,
>which one is better?"
>
>Now obviously if the two are different then one is better, and if
>one is better it's probably B since B had the higher sample mean.
>But are we in fact justified in jumping from a two-tailed test (=/=)
>to a one-tailed result (>)?
>
>Here we have a tiny p-value, and in fact a one-tailed test gives a
>p-value of 0.0001443.
The p-value from the one-tailed test will always be half the p-value from
the two-tailed test (provided the statistic falls on the side favoured by
the one-tailed alternative), so your cautious strategy will make the same
accept/reject decisions as just doing the two-tailed test.
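A quick numeric check of this halving, sketched with SciPy (the mileage
figures below are made up for illustration, not the ones from the exercise):

```python
import numpy as np
from scipy import stats

# Hypothetical mileage samples (NOT the figures from the exercise).
a = np.array([27.1, 29.4, 28.6, 30.2, 26.8, 29.9])  # gasoline A
b = np.array([31.5, 33.2, 30.8, 32.9, 34.1, 31.7])  # gasoline B

# Pooled (equal-variance) two-sample t-test, as in the exercise;
# SciPy's default p-value is two-tailed.
t_stat, p_two = stats.ttest_ind(b, a, equal_var=True)

# One-tailed p-value for Ha: mean(B) > mean(A), from the same statistic.
df = len(a) + len(b) - 2
p_one = stats.t.sf(t_stat, df)

# With the statistic on the hypothesized side, p_one is exactly p_two / 2.
print(p_two, p_one)
```

So any sample that rejects at level alpha one-tailed in the observed
direction also rejects at level alpha two-tailed, and vice versa.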
I tend to think of significance tests as rulebooks designed to control
long-term false-alarm rates if they are followed. The two-tailed test
consists of limits at +/- X, set so that if the true difference is zero
you get a false alarm (see a value outside +/- X) with probability equal
to the significance level.

If the true state of affairs is that the difference is (e.g.) 13.0, then
you are correct if you declare that the difference is > 0, and we are
implicitly ignoring errors made by declaring that the difference is 0:
the only way to go wrong is to declare that the difference is < 0. But
when the true difference is zero, you are also making an error if you
say the difference is < 0, and that error is in fact easier to make,
because you are more likely to see a negative statistic when the true
value is 0 than when the true value is 13. So the error rate from
declaring the wrong direction when there is a true difference is less
than the error rate from declaring any direction when there is no true
difference, and you are justified in stating the direction of the
difference.
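The error-rate comparison above can be checked with a small simulation (my
own sketch with made-up parameters, not figures from the exercise): under a
true difference of zero, the rate of rejecting two-tailed and then pointing
in the negative direction is about alpha/2, while under a genuine positive
difference it is far smaller.

```python
import numpy as np
from scipy import stats

# Monte Carlo sketch: reject with a two-tailed pooled t-test at alpha = 0.05,
# then read the direction off the sign of the statistic. How often do we
# wrongly declare B < A?
rng = np.random.default_rng(42)
alpha, n, reps = 0.05, 10, 20_000

def negative_call_rate(true_diff):
    """Rate of rejecting two-tailed AND getting a negative statistic."""
    a = rng.normal(0.0, 1.0, size=(reps, n))
    b = rng.normal(true_diff, 1.0, size=(reps, n))
    t, p = stats.ttest_ind(b, a, axis=1, equal_var=True)
    return np.mean((p < alpha) & (t < 0))

r0 = negative_call_rate(0.0)   # true difference 0: about alpha/2 = 0.025
r1 = negative_call_rate(1.5)   # true difference +1.5: far rarer
print(r0, r1)
```

The directional error is worst when the truth sits at zero, and even there
it is only half the nominal two-tailed level.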
Note that this doesn't generalise very far - see any number of write-ups
about testing for pairwise differences after an ANOVA has rejected the
hypothesis that all the means are equal.
--
A. G. McDowell
=================================================================
Instructions for joining and leaving this list and remarks about
the problem of INAPPROPRIATE MESSAGES are available at
http://jse.stat.ncsu.edu/
=================================================================