On Thu, Dec 6, 2012 at 10:57 AM, Pedro Giffuni <p...@apache.org> wrote:
> Hi guys;
>
> FWIW, while I was playing with the new random number generator I went
> around looking for some references and I found this paper from the Journal
> of Statistical Software (2010) titled "On the Numerical Accuracy of
> Spreadsheets":
>
> http://www.jstatsoft.org/v34/i04/paper
>

Two other relevant papers:

http://arc.nucapt.northwestern.edu/~karnesky/sdarticle.pdf

http://www.csdassn.org/software_reports/gnumeric.pdf


>
> It basically shows that Calc, among other Spreadsheet programs, is not
> really well suited for statistical analysis.
>
> Something rather amazing is that the major statistical suites have been moving
> towards a more "spreadsheet-like" environment. I am personally a fan of
> Minitab as it brings many functions that I needed for Quality control in a
> previous job. The price of the software package sky-rocketed in a few years
> though :(.
>
> One approach could be improving our local functions to match more
> demanding specifications: some of that will necessarily have to be done.
> Another approach could be facilitating interactions with software like R,
> and I am aware that approach has many followers. A third approach, which
> I would like to suggest as a future project, would be developing a scaddin
> focused on statistics and making full use of the functions from boost that
> we already have available as a module but we are not using to their full
> extent.
>

So two entirely different questions:

1) Improving the accuracy of the statistical (and other numerical)
methods we already have.

2) Extending the range of numerical methods we provide out-of-the-box.

I think #1 is a no-brainer, but it does require some expertise.  The
hard part is determining whether we have improved.  For most problems
we probably already get the same results as SPSS, R or other standard
statistical packages.  To really make an improvement we need to test
the edge cases, the "poorly conditioned" and more complex cases.
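
To illustrate what a "poorly conditioned" test case looks like, here is
a minimal Python sketch. It is an illustration only: the function names
and the data set are made up for this example, and nothing here is
taken from Calc's implementation. It compares the one-pass "textbook"
variance formula with Welford's updating algorithm on values that have
a large mean and a small spread.

```python
def naive_variance(xs):
    """Sample variance via the textbook formula (sum(x^2) - (sum x)^2 / n) / (n - 1).

    Subtracting two huge, nearly equal numbers cancels most significant digits.
    """
    n = len(xs)
    total = sum(xs)
    total_sq = sum(x * x for x in xs)
    return (total_sq - total * total / n) / (n - 1)


def welford_variance(xs):
    """Sample variance via Welford's updating algorithm.

    Works on deviations from the running mean, so it avoids the cancellation.
    """
    mean = 0.0
    m2 = 0.0  # running sum of squared deviations from the mean
    for k, x in enumerate(xs, start=1):
        delta = x - mean
        mean += delta / k
        m2 += delta * (x - mean)
    return m2 / (len(xs) - 1)


# Hypothetical test data: small spread (4, 7, 13, 16) on top of a large offset.
# The exact sample variance is 30, independent of the offset.
data = [1e9 + d for d in (4.0, 7.0, 13.0, 16.0)]

print("naive  :", naive_variance(data))
print("welford:", welford_variance(data))
```

With small offsets both functions agree, which is why ordinary test
data does not reveal the difference. Only the shifted data shows that
the naive formula has lost essentially all of its significant digits.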

For #2, it probably makes sense to define a bridge to R.  R is now
the standard and there are hundreds of libraries that extend the
environment.  You can call R routines from SAS or SPSS.  I just got
the new Mathematica 9 upgrade, and guess what?  They've now added the
ability to call R.  So some seamless way of calling R routines and
embedding R plots in Calc would be great.

-Rob

> I know we are all busy with other stuff to improve for 4.0 Release, just
> thought I'd leave the idea for the future.
>
> cheers,
>
> Pedro.
