Paige Miller wrote:
> 
> [EMAIL PROTECTED] wrote:
> > I ran a z test on two sets of scores using the Excel data analysis
> > tool to seek to determine whether the difference in the means was
> > statistically significant.  What I do not understand is how to read
> > the results of the z test Excel returns.  It gives me a "z", "P" one
> > tail and "P" two tail.  Can anyone explain to me how to interpret
> > these results?  Thanks.
> 
> If you did the z-test by hand, how would you interpret the z-value? The
> results from Excel can be interpreted the exact same way.

        Paige:  The fact that "dbmail" did a z test is almost certainly
evidence that he/she has had little background in statistics (or learned
it in a day when one didn't "interpret p values" but compared test
statistics to a table of 2.5th percentiles); so that comment isn't going
to help.

        Dbmail (should I call you D.B. for short?  This isn't Dear Abby; you're
allowed to give your real name): Firstly, the Z test is inappropriate
unless you know the variance of the population you are testing a sample
from, rather than estimating it from the data. This situation rarely
occurs.  In particular, if the software did not prompt you to type in the
standard deviation, it *was* estimating it from the data.
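
        By way of illustration, here is a minimal sketch of the distinction
(Python with numpy - my own illustration, with made-up numbers, and nothing
that Excel itself produces): the z statistic uses a standard deviation
supplied from outside the data, while the t statistic estimates it from
the sample.

import numpy as np

scores = np.array([72, 68, 75, 80, 66, 71, 77, 69])   # made-up data
mu0 = 70.0                                             # hypothesized mean

# z: only legitimate if sigma is KNOWN from outside the data
sigma_known = 5.0                                      # assumed, not estimated
z = (scores.mean() - mu0) / (sigma_known / np.sqrt(len(scores)))

# t: standard deviation estimated from the sample itself
s = scores.std(ddof=1)
t = (scores.mean() - mu0) / (s / np.sqrt(len(scores)))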

        Many years ago people used to use a "Z test" that was just a t test
done with the Z tables whenever they had more than 30 data. This was
done because even longer ago t tables hadn't been invented, or because
of insecurity about using the "50 degrees of freedom" line when you have
55 degrees of freedom (=56 data).  For some reason people have been very
concerned about this; most t tables out there have far more rows than
needed. There is one (Milton, McTeer, and Corbet) that has 101 different
rows, some differing only in the third decimal place(!)

        However, the common way of dealing with this concern - using the Z test
instead, with computed standard deviation - was completely irrational.
It was, in fact, equivalent to using the nu=infinity row of the t table!
So what these people were saying was really:

        "I have 55 degrees of freedom and I don't want to round it to 50 or to
60 so I'll round it to infinity instead."

        Now, this didn't make a huge difference, because above about fifty
degrees of freedom not a lot changes in the t table. But it makes no
sense. It would be like saying "I'll turn all the sixes in the last
decimal place into nines" (and telling students to do the same). It
complicates things, makes them a tiny bit wrong, and has no advantages.
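
        If you want to see just how little is at stake, compare the two-sided
5% critical values directly (a quick sketch in Python with scipy; any decent
table shows the same thing):

from scipy import stats

for df in (30, 50, 55, 60):
    print(df, stats.t.ppf(0.975, df))   # about 2.04, 2.01, 2.00, 2.00
print("inf", stats.norm.ppf(0.975))     # 1.96 - the "Z test" value

        Rounding 55 degrees of freedom down to 50 changes the critical value
by less than 0.01; rounding it up to infinity swaps 2.00 for 1.96, which is
the larger error.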

        One other possible contributing factor is that Z tables give
many, many probabilities, whereas most easily accessible (= "in the back
of the text") t tables give only six to ten probabilities (see my article
"A t-table for today"):

        http://www.amstat.org/publications/jse/v5n2/dawson.

This does mean that a Z test could be reported as a p value, not as
"accept/reject", while a t test could usually not be. With computers,
however, that's becoming a non-issue.

        Anyhow: here is how you read the results (once you've redone it as a
t test).  The t statistic measures how many standard errors the observed
mean lies from the hypothesized mean.  Report it.
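
        A worked sketch with made-up numbers: a sample mean of 73.5, a
hypothesized mean of 70, a sample standard deviation of 6 and n = 36 give a
standard error of 1, so the observed mean sits 3.5 standard errors above the
hypothesized one.

xbar, mu0, s, n = 73.5, 70.0, 6.0, 36   # made-up numbers
se = s / n ** 0.5                       # standard error = 6/sqrt(36) = 1.0
t_stat = (xbar - mu0) / se              # 3.5 standard errors above mu0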

        The 2-tailed p value is a measure of how often you would see a t
statistic this large purely due to sampling variation _if_ the null
hypothesis were true. If it is very small, that is evidence that something
other than sampling variation lies behind the difference.  Report this.
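
        Continuing the made-up example above (t = 3.5 on 35 degrees of
freedom), the two-tailed p-value comes out around 0.0013 - small enough that
sampling variation alone is an unconvincing explanation:

from scipy import stats

t_stat, df = 3.5, 35                     # values from the sketch above
p_two = 2 * stats.t.sf(abs(t_stat), df)  # about 0.0013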

        Traditionally 5% was considered to be the cutoff between "statistical
significance" (p<5%) and "statistical non-significance" (p>5%), but this is
bogus, like saying that any man 6'0" tall or taller is "tall" and any
other man is "not tall". It is an artificial dichotomy. It should also
be noted that the p-value is (unless the null is exactly true)
controlled by sample size - with a bigger sample size your p-value
drops. It measures strength of evidence, not truth.
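
        The sample-size effect is easy to demonstrate: hold the (made-up)
observed difference and spread fixed and let n grow, and the p-value shrinks
even though the effect itself has not changed:

from scipy import stats

diff, s = 2.0, 10.0                      # same effect, same spread, made up
for n in (20, 80, 320):
    t_stat = diff / (s / n ** 0.5)
    print(n, 2 * stats.t.sf(t_stat, n - 1))   # p shrinks as n grows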

        The 1-tailed p-value has applications in quality control and safety
testing.  Basically, it is a measure of how often you would see a t
statistic this large _and_in_this_direction_, purely due to sampling
variation _if_ the null hypothesis were true.  There have been
suggestions that it can and should be used in research - I would argue
strongly against this. It seems like a great deal - you get to halve
your p-value AND you never need to report an effect in the direction
opposite to the one you were hoping for - but neither of these is
honest. The "value" of reducing the p-value lies only in the hope that
the ignorant will compare it with a two-tailed p-value and be
impressed.  Ignore this number completely in research contexts, and do
not report it.
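
        For completeness, since Excel prints it anyway: when the observed
effect is in the hypothesized direction, the one-tailed p-value is exactly
half the two-tailed one - which is all that the "great deal" above amounts
to.  A one-line sketch, using the same made-up example:

from scipy import stats

t_stat, df = 3.5, 35
p_one = stats.t.sf(t_stat, df)          # exactly half the two-tailed value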

        You might also consider reporting a t confidence interval for the mean.
This is an interval constructed by a method that has the property that
95% of the intervals it constructs contain the true population mean. (It
does NOT mean that there is a 95% probability, after the data have been
plugged in, that the true mean lies within that interval!)
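
        The computation is straightforward (a Python sketch again, with the
same made-up data as before; the 0.975 quantile gives a two-sided 95%
interval):

import numpy as np
from scipy import stats

scores = np.array([72, 68, 75, 80, 66, 71, 77, 69])   # made-up data again
xbar, s, n = scores.mean(), scores.std(ddof=1), len(scores)
half = stats.t.ppf(0.975, n - 1) * s / np.sqrt(n)
ci = (xbar - half, xbar + half)          # 95% t interval for the mean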

                -Robert Dawson