Hi Regina,
I believe this discussion fits better on the sc mailing list.
I will describe below how I would handle this issue. I would not delve
deep into the current z-test implementation. The z-test is very limited, so
I would rather extend the t-test to cover one group of data.
Calc currently fares badly, as it offers a semi-robust test only for 2
groups of data. So, I would redirect all efforts to extending Calc's
capabilities both to 1 group of data and to >2 groups of data.
Step 1:
Extend the t-test to accept a single group of data.
=TTEST( <range> , <number> , tails = 2 , variance = NULL )
Compares the mean of the data group <range> to the value <number>, using
the number of tails given by 'tails', and assuming by default that the
variance is equal to the variance of <range>.
tails = 2: 2-tailed (alternative: "two.tailed")
tails = 1: one tailed less (alt: "less")
tails = 3: one tailed greater (alt: "greater")
<optional parameter>
variance = NULL: use the variance of <range>
variance = <number>: use this variance instead
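
For illustration only, here is a minimal Python sketch of the statistic
such a single-group TTEST would compute. It is not Calc code; the function
name one_sample_t and its parameters are hypothetical and simply mirror
<range>, <number> and variance above. It returns the t value and the
degrees of freedom only; turning them into a p-value needs the Student t
distribution, which is left out here.

```python
import math
import statistics


def one_sample_t(data, mu0, variance=None):
    """Return (t, df) for comparing the mean of `data` with `mu0`.

    Hypothetical helper, not part of Calc or any library.
    variance=None  -> use the sample variance of `data` (n - 1 denominator)
    variance=<num> -> use the supplied variance instead
    """
    n = len(data)
    if n < 2:
        raise ValueError("need at least two values")
    var = statistics.variance(data) if variance is None else variance
    if var <= 0:
        raise ValueError("variance must be positive")
    standard_error = math.sqrt(var / n)
    t = (statistics.mean(data) - mu0) / standard_error
    return t, n - 1


# Example: five values compared against the reference value 2
t, df = one_sample_t([1, 2, 3, 4, 5], 2)
print(t, df)
```

Note that when a variance is supplied rather than estimated from the data,
the statistic is usually compared against the normal distribution, so that
case is effectively a z-test.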
Step 2:
Implement ANOVA to cover 2 or more groups of data. I posted some C++
code to issue 4921, see
http://www.openoffice.org/issues/show_bug.cgi?id=4921
[It implements only the simple one-way ANOVA, skipping the block-design.]
=ANOVA( <range> , design = 1)
# EVERY column = one set of data
=ANOVA( <range1> , <range2> , ... , design = 1 )
# EVERY RANGE = one set of data
design = 1: one way ANOVA
design = 2: two way ANOVA (factorial block design)
design = 3: two way ANOVA (randomized block design)
See also http://www.statmethods.net/stats/anova.html
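
As a rough sketch of what design = 1 would have to compute, the following
Python function calculates the one-way ANOVA F statistic. It is not the
C++ code attached to issue 4921, and the name one_way_anova is made up for
this example. Each inner list plays the role of one column or one range.
The p-value is omitted because it needs the F distribution.

```python
import statistics


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of data groups.

    Hypothetical helper for illustration; covers only the simple
    one-way design, no block designs.
    """
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or n_total <= k:
        raise ValueError("need at least 2 groups and more values than groups")

    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [statistics.mean(g) for g in groups]

    # Variation of the group means around the grand mean
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    # Variation of the values around their own group mean
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, means) for x in g)

    df_between = k - 1
    df_within = n_total - k
    # Raises ZeroDivisionError if every group is constant (ss_within == 0)
    f = (ss_between / df_between) / (ss_within / df_within)
    return f, df_between, df_within


f, df_b, df_w = one_way_anova([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(f, df_b, df_w)
```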
Hi Leonard,
Leonard Mada schrieb:
[...]
I have found one discussion in
http://lists.oasis-open.org/archives/office-formula/200702/msg00047.html
and Eike points back to it in
http://lists.oasis-open.org/archives/office-formula/200806/msg00050.html
But the spec still has a red ToDo in that place.
See below.
> The z-test is a simplified t-test. So, for groups larger than 30 values,
> it should be quite close to the t-test.
>
> The first thing to strike you is the fact that you can't use the z-test
> and the t-test simultaneously in Calc. This is because, in Calc (I don't
> know about Excel), the t-test works ONLY on 2 groups of data, while the
> z-test works on a SINGLE group of data. This is a design flaw in the
> statistics engine.
I do not see any attempt to change that, not even an issue.
>
> BOTH tests should work both on a single group of data, and on 2 groups
> of data (while the ANOVA works on 2 or more groups of data). This is a
> MAJOR shortcoming of Calc. You can't use a somewhat more robust test
> (t-test) to compare a single group of data against a reference value.
>
> For fewer than 30 values, the t-test is preferred, and actually is the
> only test in R (there is a special package that has the z-test
> implemented for teaching purposes, I forgot the name but Google will
> probably get it).
>
> You can use 30 or more values and compute the t-test in R. It should
> yield nearly the same results as the z-test, e.g.:
> x<-rnorm(30)
> t.test(x, mu = 0.5)
I don't have R. I have only got Excel and Gnumeric.
R is open source. Google for "R", or go to http://cran.R-project.org,
and you can get R. It runs under almost every platform (support for
Win9x was dropped in the latest R, but I can confirm that it runs on
Win2k). Be warned, the learning curve is steep.
Basics:
Creating a vector:
x<- c( <number1>, <number2> , ... )
30 random numbers:
x<- rnorm(30)
t.test:
t.test( <vector1> , <vector2> ) # two.sided
t.test( <vector1> , <vector2> , "less" ) # one sided, less
t.test( <vector1> , <vector2> , "greater" ) # one sided, greater
t.test ( <vector> , mu = <number> ) # one group of data
# don't forget to write the string 'mu='
# "less" and "greater" apply similarly
There is also a z-test available in the package 'TeachingDemos' (you need
to install this package first), see:
http://rss.acs.unt.edu/Rdoc/library/TeachingDemos/html/z.test.html
Sincerely,
Leonard
>
> In this instance, we compare the mean of the sample x against another
> mean mu = 0.5 (don't forget the 'mu', otherwise you get an error).
>
> If the z-test in Calc gives a different result, then it is wrong.
It would be nice to get a test spreadsheet with dummy data and the
results which R returns.
> As with t-test, z-test can be one-sided or 2-sided, but the standard
> should be 2-sided.
In the spec it is now 2-sided.
>
> I hope this helps.
Not really. If we implement ZTEST in the 2-sided way, as it is
now defined in the spec, then it will differ from the current behavior.
Therefore, going to ODF 1.2, there will be a new ZTEST which gives other
results than the old one. How should Calc handle this? Or should we try
to get OASIS to define a 1-sided way? But even then it would be
different from now, because the 1-sided way is not correctly implemented
in Excel and Calc; at least that is how I understand the comments on the
mailing list.
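
To make the difference between a 1-sided and a 2-sided result concrete,
here is a small Python sketch using only the standard library. It does not
reproduce what Calc or Excel currently return; the function name
z_test_p_values and the optional sigma argument are assumptions made for
this example.

```python
import math
from statistics import NormalDist, mean, stdev


def z_test_p_values(data, mu0, sigma=None):
    """Return (z, one_sided_upper, two_sided) for a one-sample z-test.

    Hypothetical helper for illustration only.
    sigma=None -> use the sample standard deviation of `data`
    """
    n = len(data)
    s = stdev(data) if sigma is None else sigma
    z = (mean(data) - mu0) / (s / math.sqrt(n))
    # Probability of a value at least as large as z under the null hypothesis
    one_sided_upper = 1.0 - NormalDist().cdf(z)
    # Two-sided: double the smaller of the two tails
    two_sided = 2.0 * min(one_sided_upper, 1.0 - one_sided_upper)
    return z, one_sided_upper, two_sided


z, p_upper, p_two = z_test_p_values([1, 2, 3, 4, 5], 2, sigma=1)
print(z, p_upper, p_two)
```

For the same data the two numbers differ by a factor of two whenever the
sample mean lies above the reference value, so documents computed with one
definition will show other results under the other definition.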
kind regards
Regina