Hi Regina,
I believe this discussion fits better on the sc mailing list.
I will describe below how I would handle this issue. I would not delve
deeply into the current z-test implementation. The z-test is very limited,
so I would rather extend the t-test to cover one group of data.
Calc currently fares badly, as it offers a semi-robust test only for 2
groups of data. So I would redirect all efforts to extending Calc's
capabilities to cover both 1 group of data and 2 groups of data.
Step 1:
Extend the t-test to accept a single group of data.
=TTEST( range , number , tails = 2 , variance = NULL )
Compares the mean of the data group range to the value number, using
x-tails, and assuming the variance is equal to the variance of range.
tails = 2: 2-tailed (alternative: two.tailed)
tails = 1: one tailed less (alt: less)
tails = 3: one tailed greater (alt: greater)
optional parameter
variance = NULL: use the variance of range
variance = number: use this variance instead
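The proposed single-group form could behave roughly as in the following Python sketch. It is only an illustration under stated assumptions, not Calc code: the names `one_sample_ttest`, `t_pdf` and `t_cdf` are hypothetical, the tails codes (1 = less, 2 = two-tailed, 3 = greater) follow the proposal above, and the Student t distribution is integrated numerically with Simpson's rule to avoid external libraries. The sketch keeps n - 1 degrees of freedom even when a variance is supplied, which is an assumption; with a truly known variance a normal reference distribution would be the usual choice.

```python
import math
import statistics


def t_pdf(t, df):
    """Density of the Student t distribution with df degrees of freedom."""
    log_c = (math.lgamma((df + 1) / 2) - math.lgamma(df / 2)
             - 0.5 * math.log(df * math.pi))
    return math.exp(log_c - (df + 1) / 2 * math.log1p(t * t / df))


def t_cdf(t, df, steps=2000):
    """P(T <= t), via Simpson's rule on [0, |t|] plus symmetry (steps must be even)."""
    a = abs(t)
    if a == 0:
        return 0.5
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    area = total * h / 3
    return 0.5 + area if t > 0 else 0.5 - area


def one_sample_ttest(data, number, tails=2, variance=None):
    """Hypothetical model of =TTEST(range, number, tails, variance).

    Returns (t statistic, degrees of freedom, p-value).
    """
    n = len(data)
    if n < 2:
        raise ValueError("need at least 2 values")
    if tails not in (1, 2, 3):
        raise ValueError("tails must be 1 (less), 2 (two-tailed) or 3 (greater)")
    # variance=None means: use the sample variance of the data itself
    var = statistics.variance(data) if variance is None else variance
    t = (statistics.mean(data) - number) / math.sqrt(var / n)
    df = n - 1  # assumption: kept even when a variance is supplied
    lower = t_cdf(t, df)
    if tails == 1:
        p = lower
    elif tails == 3:
        p = 1 - lower
    else:
        p = 2 * min(lower, 1 - lower)
    return t, df, p
```

For example, `one_sample_ttest([0, 2], 0)` compares the mean of the two values against 0 using the sample variance, while passing `variance=8` replaces the sample variance in the standard error.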
Step 2:
Implement ANOVA to cover 2 or more groups of data. I posted some C++
code to issue 4921, see
http://www.openoffice.org/issues/show_bug.cgi?id=4921
[It implements only the simple one-way ANOVA, skipping the block-design.]
=ANOVA( range , design = 1)
# EVERY column = one set of data
=ANOVA( range1 , range2 , ... , design = 1 )
# EVERY RANGE = one set of data
design = 1: one way ANOVA
design = 2: two way ANOVA (factorial block design)
design = 3: two way ANOVA (randomized block design)
See also http://www.statmethods.net/stats/anova.html
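To make the design = 1 case concrete, here is a minimal Python sketch of the one-way ANOVA F statistic, where every argument is one set of data (as in the multi-range form above). It is not the C++ code posted to the issue; the name `one_way_anova` is hypothetical, and the p-value is left out because it needs the F distribution, which the Python standard library does not provide.

```python
def one_way_anova(*groups):
    """One-way ANOVA: return (F, df_between, df_within) for 2 or more groups."""
    k = len(groups)
    if k < 2:
        raise ValueError("need at least 2 groups")
    if any(len(g) == 0 for g in groups):
        raise ValueError("every group needs at least 1 value")
    n_total = sum(len(g) for g in groups)
    if n_total <= k:
        raise ValueError("need more values than groups")

    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Variation of the values around their own group mean
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means) for x in g)
    if ss_within == 0:
        raise ValueError("no within-group variation, F is undefined")

    df_between = k - 1
    df_within = n_total - k
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within
```

With the three groups `[1, 2, 3]`, `[2, 3, 4]` and `[3, 4, 5]`, the between-group and within-group sums of squares are both 6, so F = (6 / 2) / (6 / 6) = 3 with 2 and 6 degrees of freedom.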
Hi Leonard,
Leonard Mada wrote:
[...]
I have found one discussion in
http://lists.oasis-open.org/archives/office-formula/200702/msg00047.html
and Eike refers back to it in
http://lists.oasis-open.org/archives/office-formula/200806/msg00050.html
But the spec still has a red ToDo in that place.
See below.
The z-test is a simplified t-test. So, for groups larger than 30 values,
it should be quite close to the t-test.
The first thing that strikes you is that in Calc you can't apply the
z-test and the t-test to the same data. This is because, in Calc (I don't
know about Excel), the t-test works ONLY on 2 groups of data, while the
z-test works on a SINGLE group of data. This is a design flaw in the
statistics engine.
I do not see any attempt to change that, not even an issue.
BOTH tests should work both on a single group of data, and on 2 groups
of data (while the ANOVA works on 2 or more groups of data). This is a
MAJOR shortcoming of Calc. You can't use a somewhat more robust test
(t-test) to compare a single group of data against a reference value.
For fewer than 30 values, the t-test is preferred, and it is actually the
only one of the two available in base R (there is a special package that
has the z-test implemented for teaching purposes; I forgot the name, but
Google will probably find it).
You can use more than 30 values and compute the t-test in R. It should
yield nearly the same results as the z-test, e.g.:
x <- rnorm(30)
t.test(x, mu = 0.5)
I don't have R. I have only got Excel and Gnumeric.
R is open source. Search for R, or go to http://cran.R-project.org,
to download it. It runs on almost every platform (support for
Win9x was dropped in the latest R, but I can confirm that it runs on
Win2k). Be warned, the learning curve is steep.
Basics:
Creating a vector:
x <- c( number1, number2 , ... )
30 random numbers:
x <- rnorm(30)
t.test:
t.test( vector1 , vector2 ) # two.sided
t.test( vector1 , vector2 , alternative = "less" ) # one sided, less
t.test( vector1 , vector2 , alternative = "greater" ) # one sided, greater
t.test( vector , mu = number ) # one group of data
# don't forget to write the string 'mu='
# alternative = "less" and "greater" apply similarly
There is also a z-test available in package 'TeachingDemos' (you need to
download this package first), see:
http://rss.acs.unt.edu/Rdoc/library/TeachingDemos/html/z.test.html
Sincerely,
Leonard
In this instance, we compare the mean of the sample x against another
mean mu = 0.5 (don't forget the 'mu', otherwise you get an error).
If the z-test in Calc gives a different result, then it is wrong.
It would be nice to get a test spreadsheet with dummy data and the
results which R returns.
As with the t-test, the z-test can be one-sided or 2-sided, but the
standard should be 2-sided.
In the spec it is now 2-sided.
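The difference between the one-sided and 2-sided results can be sketched in Python as follows. This is a minimal illustration, not the Calc or Excel implementation and not the wording of the spec: the names `z_test` and `normal_cdf` are hypothetical, and falling back to the sample standard deviation when no sigma is given is an assumption of the sketch.

```python
import math
import statistics


def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * math.erfc(-z / math.sqrt(2))


def z_test(data, mu, sigma=None):
    """One-sample z-test of the mean of data against mu.

    Returns the z statistic together with both one-sided p-values
    and the 2-sided p-value.
    """
    n = len(data)
    # Assumption: without a given sigma, use the sample standard deviation
    s = statistics.stdev(data) if sigma is None else sigma
    z = (statistics.mean(data) - mu) / (s / math.sqrt(n))
    lower = normal_cdf(z)
    return {
        "z": z,
        "less": lower,                            # one-sided, lower tail
        "greater": 1 - lower,                     # one-sided, upper tail
        "two_sided": 2 * min(lower, 1 - lower),   # 2-sided
    }
```

The 2-sided value is twice the smaller of the two one-sided values, so a function that switches from a one-sided to a 2-sided definition returns different numbers for the same input.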
I hope this helps.
Not really. If we implement ZTEST in the 2-sided way, as it is
now defined in the spec, then it will differ from the current behavior.
So with the move to ODF 1.2 there will be a new ZTEST that gives different
results from the old one. How should Calc handle this? Or should we try
to get OASIS to define a 1-sided way? But even then it would be
different from now, because the 1-sided way is not correctly implemented
in Excel and Calc; at least that is how I understand the comments on the
mailing list.
kind regards
Regina