Dear Eike, dear Regina,
I will try to explain the rationale behind the z-test.
Unfortunately, the quirks behind its computation in spreadsheet
software make it not that easy to describe.
The assumptions of the z-test:
- you have a random sample with a mean Xs
- there is a population that follows a gaussian distribution
with mean muP and variance sigma^2
- the question is whether this sample was drawn from this population
- the statistical hypotheses are:
H0: Xs = muP
Ha: Xs != muP (2-tailed)
The one-tailed version of Ha is:
either Xs < muP or
Xs > muP (only one of these)
So, basically, what we are testing is whether it is plausible
that the sample was taken from a population with mean muP.
The z-test first computes the z-statistic, and then
derives a probability (the p-value) from this z-statistic.
[There is a direct correspondence between z and the probability.]
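
To make the correspondence concrete, here is a minimal sketch in Python.
It assumes the usual formula z = (Xs - muP) / (sigma / sqrt(n)), where n
is the sample size; the function name and the numbers are mine and are
only meant as an illustration.

```python
from math import sqrt
from statistics import NormalDist

def z_statistic(sample_mean, mu_p, sigma, n):
    """Return z = (Xs - muP) / (sigma / sqrt(n)) for a sample of size n."""
    standard_error = sigma / sqrt(n)
    return (sample_mean - mu_p) / standard_error

# Hypothetical numbers, chosen only for illustration.
z = z_statistic(sample_mean=52.0, mu_p=50.0, sigma=5.0, n=25)

# Each z maps to exactly one cumulative probability under the
# standard normal distribution.
prob_below = NormalDist().cdf(z)
print(z, prob_below)
```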
So, the 2-tailed version looks like:
- if computed z is more extreme than a critical z0, then
we have to reject H0
- else, we fail to reject H0 (informally: we accept H0)
More extreme means:
either z < -|z0| or z > |z0|, where |...| is the absolute value;
We compare 2 z values only when we talk about
interpreting the z-statistic.
Otherwise, we do not compare 2 values.
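
The 2-tailed decision rule above could be sketched as follows. The
significance level alpha = 0.05 is my assumption (it is not fixed anywhere
in this thread), and the function name is hypothetical.

```python
from statistics import NormalDist

def reject_h0_two_tailed(z, alpha=0.05):
    """Return True if z is more extreme than the critical value z0."""
    # z0 cuts off alpha/2 in each tail of the standard normal distribution.
    z0 = NormalDist().inv_cdf(1.0 - alpha / 2.0)
    # "More extreme" means z < -|z0| or z > |z0|.
    return abs(z) > z0

print(reject_h0_two_tailed(2.0))
print(reject_h0_two_tailed(1.5))
```

With alpha = 0.05 the critical value is roughly 1.96, so z = 2.0 leads to
rejecting H0 and z = 1.5 does not.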
The z-test simply gives us the probability, under the null
hypothesis, of observing a z-statistic as extreme or more extreme
than that calculated, or, as written on MathWorks:
"The p-value is the probability, under the null hypothesis,
of observing a value as extreme or more extreme of the
z test statistic..." (slightly reworded)
Or, in yet other words:
we obtain the probability of observing, in a random sample from
the given study population, a z-statistic at least as extreme as
that calculated. [This is the meaning of the p-value.]
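
As a small sketch of this 2-tailed p-value (assuming the standard normal
distribution under H0; the function name is mine):

```python
from statistics import NormalDist

def p_value_two_tailed(z):
    """Probability under H0 of a z-statistic at least as extreme as z."""
    # Upper-tail area beyond |z|, doubled because both tails count.
    upper_tail = 1.0 - NormalDist().cdf(abs(z))
    return 2.0 * upper_tail

print(p_value_two_tailed(2.0))
```

The sign of z does not matter here: z = 2.0 and z = -2.0 give the same
p-value, because both are equally far from 0.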
This sounds good, and is easily understandable.
The problem with the z-test implementation in spreadsheets
(I infer the implementation details from previous posts,
I did not test it specifically), is that a different
probability is computed, namely:
the probability of observing a z > computed z for this sample.
Statistically, this is the "one-sided" "greater" alternative.
But this is not as easy to explain if you do not understand
statistics.
So basically, we compute the probability, under the null
hypothesis, of observing a z-statistic greater than the one
computed.
"Under the null hypothesis" means to observe such a value
by chance alone (aka randomly).
The shortest definition that still makes some sense is:
The probability of a z-statistic greater than the one computed.
[where computed is based on the sample]
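
A rough sketch of this one-sided "greater" probability, as I understand the
spreadsheet behaviour from the previous posts (I did not check the actual
Calc source). It assumes a known population sigma; the function names and
the data are hypothetical.

```python
from math import sqrt
from statistics import NormalDist, fmean

def ztest_greater(data, mu_p, sigma):
    """Probability under H0 of a z-statistic greater than the sample's z."""
    n = len(data)
    z = (fmean(data) - mu_p) / (sigma / sqrt(n))
    return 1.0 - NormalDist().cdf(z)

def two_tailed_from_greater(p_greater):
    """Convert the one-sided "greater" probability to a 2-tailed p-value."""
    return 2.0 * min(p_greater, 1.0 - p_greater)

sample = [51.0, 53.0, 52.0, 52.0]   # illustrative data, mean 52
p_greater = ztest_greater(sample, mu_p=50.0, sigma=2.0)
p_two = two_tailed_from_greater(p_greater)
print(p_greater, p_two)
```

Note that the one-sided value exceeds 0.5 whenever the sample mean lies
below muP, which is why the conversion takes the smaller of the two tails.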
I hope this sounds English enough, but unfortunately I am not
a native speaker either. I would welcome some input from a
native English speaker.
Sincerely,
Leonard Mada
A last note:
I understand what was meant in the previous definition
with a second sample, but I found that explanation very
confusing, because we never take a 2nd sample. Also, the
"2" samples are never compared.
[A 2nd sample would also cause a lot of trouble because
of 2 means and 2 distinct variances. The actual reasoning
refers to H0 and goes like this:
We draw a hypothetical random sample, and compute
the probability to get a z-statistic as extreme or more
extreme than that observed with our real sample. It is
the probability of drawing such a sample, not of comparing
2 samples.]
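
The "hypothetical random sample" reasoning can be illustrated with a small
simulation: draw many samples from the H0 population and count how often
their z-statistic is at least as extreme as the observed one. This is only
a sketch to show the idea, not how any spreadsheet computes the value; all
names, the seed and the numbers are my assumptions.

```python
import random
from math import sqrt
from statistics import NormalDist

def simulated_p_two_tailed(z_observed, mu_p, sigma, n, trials=20000, seed=1):
    """Estimate the 2-tailed p-value by drawing samples under H0."""
    rng = random.Random(seed)
    standard_error = sigma / sqrt(n)
    hits = 0
    for _ in range(trials):
        # One hypothetical sample of size n from the H0 population.
        sample_mean = sum(rng.gauss(mu_p, sigma) for _ in range(n)) / n
        z = (sample_mean - mu_p) / standard_error
        if abs(z) >= abs(z_observed):
            hits += 1
    return hits / trials

estimate = simulated_p_two_tailed(z_observed=2.0, mu_p=50.0, sigma=5.0, n=10)
exact = 2.0 * (1.0 - NormalDist().cdf(2.0))
print(estimate, exact)
```

The simulated fraction should land close to the exact tail probability,
and at no point is the real sample compared with a second real sample.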
-------- Original Message --------
> Date: Wed, 26 Aug 2009 22:35:29 +0200
> From: Eike Rathke <[email protected]>
> To: [email protected]
> Subject: Re: [sc-dev] Re: [Issue 90759] ZTEST not same as Excel
> Hi Regina,
>
> On Wednesday, 2009-08-26 18:02:44 +0200, Regina Henschel wrote:
>
> >> "calculates the probability of observing a value as large
> >> or larger for the z-statistic"
> >
> > There is a comparison "observing a value larger" but it does not
> > contain, to what it is compared. There must be something like
> > "observing a value larger than ...".
> >
> > I think "as large as..." can be dropped, it makes no difference for a
> > continuous distribution and the text becomes shorter.
> >
> > Is "for the z-statistic" an attribute to "a value"? I understand it so.
> > Is it a typical sentence order in English to put it at the end?
> >
> > In German I would say "Berechnet die Wahrscheinlichkeit einen Wert der
> > Gauß-Statistik zu beobachten, der größer ist als der Wert der
> > Gauß-Statistik der Stichprobe." But I'm not sure, Leonardo wants to say
> > this. ("Z-Statistik" does not exist in German.)
>
> Translating that I'd get, hopefully correct:
>
> "Calculates the probability of observing a value of the z-statistic
> larger than the value of the sample's z-statistic."
>
> Is that what we want to say?
>
> > Describing the function using 'z-statistic' is indeed better than using
> > a description with 'mean', because of the function name ZTEST.
>
> I agree.
>
> Eike
>
> --
> OOo/SO Calc core developer. Number formatter stricken i18n
> transpositionizer.
> SunSign 0x87F8D412 : 2F58 5236 DB02 F335 8304 7D6C 65C9 F9B5 87F8 D412
> OpenOffice.org Engineering at Sun: http://blogs.sun.com/GullFOSS
> Please don't send personal mail to the [email protected] account, which I use
> for mailing lists only and don't read from outside Sun. Use [email protected]
> Thanks.