Hi Kohei,

Kohei Yoshida wrote:
> Embedding of R into Calc is legally impossible as R is released under
> GPL.  However, we could dynamically load it at run time without
> violating the license term (but again, IANAL).

Well, by embedding I mean loading the necessary resources at runtime (as discussed on the stat wiki page). ;-)

> Do you have a reference for this code i.e. is it entirely your own
> code, or is it derived from another source (application, book, paper,
> etc.)?

The code is a direct port by *ME* of the mathematical definition of ANOVA, i.e. the definition given in all the books (and articles) I have read about ANOVA.

HOWEVER: almost all books then give another "practical" formula, which calculates the sum of squared residuals as a difference of 2 other terms. *These formulas* (containing a difference) are usually numerically *unstable* (see my previous discussion, and the current implementation of correlation and some other Calc functions, for the adverse effects of a subtraction), in that they can generate impossible values (like a negative F statistic).
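
To illustrate the point, here is a minimal Python sketch (not the Calc code; the function names and the sample data are made up for illustration). It compares the definitional sum of squares with the "practical" difference formula on values that have a large common offset:

```python
def ss_definitional(values):
    """Sum of squared deviations from the mean (two passes over the data)."""
    mean = sum(values) / len(values)
    return sum((x - mean) ** 2 for x in values)


def ss_shortcut(values):
    """'Practical' textbook form: sum(x^2) - (sum(x))^2 / n."""
    n = len(values)
    return sum(x * x for x in values) - sum(values) ** 2 / n


# Hypothetical data: small spread on top of a large offset.
data = [1e9 + 4, 1e9 + 7, 1e9 + 13, 1e9 + 16]

print(ss_definitional(data))  # 90.0 (exact for this data)
# Not 90: the squares are near 1e18, where adjacent doubles are 128 apart,
# so the subtraction returns mostly rounding error.
print(ss_shortcut(data))
```

Depending on the data, the shortcut result can even come out negative, which is how an impossible F statistic can arise.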

So, this is essentially my port of the formula. Of course, I took a look at the correlation code (because I previously had NO idea how to access the data in Calc). My other reference (a very good one) was: http://courses.ncssm.edu/math/Stat_Inst/PDFS/NEWANOVA.pdf (see also the stat wiki page, ANOVA section, http://wiki.services.openoffice.org/wiki/Statistical_Data_Analysis_Tool#Multiple-Groups_Inference). However, that document describes the matrix approach to ANOVA. Calc does NOT allow matrix calculations, so I translated the ideas back into simple calculations.

[Currently I have implemented only the one-way, non-blocked type of ANOVA. Once the code is complete and functional, I will make the changes for the various flavours of ANOVA, but it makes NO sense right now to have multiple code segments to update.]
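
For readers who want to see the shape of the calculation, the one-way, non-blocked case can be sketched in Python as follows. This is only an illustration of the definitional (non-matrix) approach, not the Calc patch itself; the function name and the return layout are assumptions, and dfB/dfE are the between-groups and error degrees of freedom.

```python
def one_way_anova_f(groups):
    """Return (F, dfB, dfE) for a list of groups, each a list of numbers.

    Uses the definitional sums of squares (deviations from means),
    not the difference-based shortcut formula.
    """
    k = len(groups)                      # number of groups
    n = sum(len(g) for g in groups)      # total number of observations
    grand_mean = sum(sum(g) for g in groups) / n

    ss_between = 0.0
    ss_within = 0.0
    for g in groups:
        group_mean = sum(g) / len(g)
        ss_between += len(g) * (group_mean - grand_mean) ** 2
        ss_within += sum((x - group_mean) ** 2 for x in g)

    df_b = k - 1
    df_e = n - k
    f_stat = (ss_between / df_b) / (ss_within / df_e)
    return f_stat, df_b, df_e


f_stat, df_b, df_e = one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]])
print(f_stat, df_b, df_e)
```

The sketch does no input checking: it assumes at least two groups, more observations than groups, and a non-zero within-group sum of squares.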

One LAST COMMENT:
The code outputs the *F statistic*, NOT the *p Value*.
To obtain the p-value, a call to FDIST('F statistic value', dfB, dfE) must be made, BUT:
- I have NOT yet figured out where this function is and what it is called
- all statistical software outputs both the F statistic AND the p-value
  [and a lot of other data - this is actually meaningful]
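
For reference, the p-value is the upper tail of the F distribution, which can be written with the regularized incomplete beta function: p = I_x(dfE/2, dfB/2) with x = dfE / (dfE + dfB * F). The following standalone Python sketch shows that step. It is NOT how Calc implements FDIST (which I have not located); the function names are made up, and the incomplete beta is evaluated with the usual continued-fraction method.

```python
import math


def _beta_cf(a, b, x, max_iter=200, eps=3e-16):
    """Continued fraction for the incomplete beta (modified Lentz method)."""
    tiny = 1e-300
    c = 1.0
    d = 1.0 - (a + b) * x / (a + 1.0)
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        # Even and odd coefficients of the continued fraction.
        coeffs = (
            m * (b - m) * x / ((a - 1.0 + m2) * (a + m2)),
            -(a + m) * (a + b + m) * x / ((a + m2) * (a + 1.0 + m2)),
        )
        for aa in coeffs:
            d = 1.0 + aa * d
            d = 1.0 / (d if abs(d) > tiny else tiny)
            c = 1.0 + aa / c
            c = c if abs(c) > tiny else tiny
            delta = d * c
            h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h


def reg_inc_beta(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    front = math.exp(
        math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
        + a * math.log(x) + b * math.log1p(-x)
    )
    # Use the symmetry relation where the fraction converges faster.
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _beta_cf(a, b, x) / a
    return 1.0 - front * _beta_cf(b, a, 1.0 - x) / b


def f_p_value(f_stat, df_b, df_e):
    """Upper-tail probability P(F >= f_stat) with (df_b, df_e) degrees of freedom."""
    if f_stat <= 0.0:
        return 1.0
    x = df_e / (df_e + df_b * f_stat)
    return reg_inc_beta(df_e / 2.0, df_b / 2.0, x)


print(f_p_value(13.0, 2, 6))
```

A quick sanity check: for dfB = 2 the tail has the closed form (dfE / (dfE + 2F))^(dfE/2), so F = 13 with dfE = 6 should give 0.1875^3, about 0.0066.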

So, in the longer term we must also think about more extensive output, because Calc is really limited here. All modern statistical functions and techniques DO OUTPUT a lot of data, not just a p-value.

Hope this clarifies some issues.

Sincerely,

Leonard Mada
