Hi,

I have a few comments to add with regard to this as well. This is mainly
from a user's point of view, as I've been having quite a few of these
discussions lately. First, let me provide the context from which I'm
approaching the problem -- the usefulness of any integration (for me) is
based on two types of users: "advanced" and "basic". An advanced user should
already be somewhat familiar with R and be able to script, simply needing an
easy way to organize data and statistical explorations. A basic user, on the
other hand, is one that doesn't know R but needs a few statistical tools for
his/her data analysis (essentially, the user shouldn't even know what R is,
just that Calc is using it to calculate results).

Mr. Mada - with regard to your example, you raise an interesting point. Any
implementation would have to limit an advanced user and force them to
reorganize data to some degree. However, in my experience R integration can
be pretty flexible here, as long as there is a way to represent Calc data as
an array. Luckily, I understand this to be the case for Calc. For a basic
user, a wizard of some sort would also alleviate this problem by providing
some flexibility and throwing errors (or advice!) about how to organize
data -- sort of like Excel and Calc do now.
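
To make the "errors (or advice!)" idea a bit more concrete, here is a rough
sketch of the kind of check a wizard might run on a selected range before
handing it to R. Everything in it is an assumption for illustration:
check_range is a made-up name, and I am assuming the Calc range arrives in R
as a matrix of cell contents.

```r
# Hypothetical check a wizard could run on a selected range before
# sending it to R. `cells` stands for the range contents as a matrix.
check_range <- function(cells) {
  advice <- character(0)
  if (!is.matrix(cells)) {
    return("Select a rectangular range of cells.")
  }
  # Try to read every cell as a number; text cells become NA.
  values <- suppressWarnings(as.numeric(cells))
  if (any(is.na(values) & !is.na(cells) & cells != "")) {
    advice <- c(advice, "Some cells contain text; numeric data is expected.")
  }
  if (any(is.na(cells) | cells == "")) {
    advice <- c(advice,
                "Some cells are empty; they will be treated as missing values.")
  }
  advice
}

# A 2x2 range with one empty cell and one text cell yields two messages.
check_range(matrix(c("1", "2", "", "x"), nrow = 2))
```

A clean numeric range returns an empty character vector, so the wizard would
only speak up when there is something to say.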

Just an aside, it might be worthwhile to provide users (especially basic
ones) with links to tools like Wikibooks (
http://en.wikibooks.org/wiki/Statistics), Wolfram (
http://mathworld.wolfram.com/topics/ProbabilityandStatistics.html), or other
sites where they could learn about the subtleties of the methods they're
using.

Ultimately, if certain objects seem very useful in R and need to be
represented in our integration (such as the discussion with factors), this
may have to be coded into the software to facilitate the user experience.
Maybe one of the first tasks of such an implementation should be to decide
which features are most important (or take them from the list at
http://wiki.services.openoffice.org/wiki/Statistical_Data_Analysis_Tool),
and how best to abstract them for the two types of users above (or
other types one may think of). This could help significantly when it finally
comes down to programming the tools.

Just some thoughts, if this were to go further. :)

Thank you,
Wojciech

On 4/1/07, Leonard Mada <[EMAIL PROTECTED]> wrote:

Dear Prof Neuwirth,

my comments were intended mainly to implement some functionality for
complete *R newbies*, that is for persons who do not know how to work in
R. [It is debatable then if they need factors for example, but for the
sake of completion, I mentioned those elements.]

I imagined the following three scenarios:

1. (for comment c)
I often encountered variables that were NOT limited to a single column:
e.g. age was written both in columns A, B, C and D. Therefore, I would
like to be able to select a range individually for every variable:
age: A1:D50
Similarly, Blood Pressure: H1:I100

The A1:H100 selection won't work in this case, because this is not a
data frame as the same data is split into columns. Rewriting the data
into a single column is time-consuming (and the data is often written in
more than one column because of another factor, so rewriting it into a
single column will lose some information).

2. (comment b)
I had in mind the:
fisher.test( matrix( c(number_1, number_2, number_3, number_4), 2 ))
and similar contingency tables. Somebody who does NOT have any idea of
R, won't be able to perform such a simple test until he learns to
construct a matrix. BUT if the user only needs the fisher test, then,
you are right, there is no real need for the user to create a matrix on
his own; it is easy to select the contingency table and to pass the
correct command to R (without the user having to know more details about
matrices).

3. (comment a)
The previous comment applies to factors, too. I had primarily in mind the
ANOVA test, but then again, the user does not need to know the details
of parsing the arguments and everything could be hidden in the
implementation. Somebody who needs factors for a different analysis,
will most likely know how to create them in R.
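
As a rough illustration of scenarios 1 and 3 together, the sketch below
stacks a multi-column range into one vector while keeping the source column
as a factor, and then runs an ANOVA on it. The data are simulated and the
names age_range and age_long are made up for the example; how the range
A1:D50 actually reaches R is assumed, not something the integration is known
to provide.

```r
# Hypothetical stand-in for the Calc range A1:D50: one variable (age)
# written across four columns, where the column itself carries meaning.
set.seed(1)
age_range <- matrix(round(runif(200, 20, 80)), nrow = 50, ncol = 4,
                    dimnames = list(NULL, c("A", "B", "C", "D")))

# Stack the columns into one vector (column by column) and keep the
# source column as a factor, so the grouping information is not lost.
age_long <- data.frame(
  age    = as.vector(age_range),
  column = factor(rep(colnames(age_range), each = nrow(age_range)))
)

# One-way ANOVA of age by source column.
fit <- aov(age ~ column, data = age_long)
summary(fit)
```

The user would only select the range; the stacking and the factor could be
built behind the scenes.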

Unfortunately, I am a little bit limited when judging RExcel:
- first, I work a lot with R, so I am not a newbie, BUT I try to think
what a newbie might want; this may sometimes backfire
- secondly, I work with R mostly at home; however, I do not use Excel at
home

Well, the issues I described were meant to allow newbies to perform some
specific tests, but it is debatable if such a newbie does indeed need to
know and use those details (those issues could be implemented/hidden in
specific menu commands).
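
A menu command for the Fisher test could, for instance, hide the matrix
construction roughly like this. This is only a sketch: fisher_from_range is
a hypothetical helper name, and I assume the selected 2x2 range is handed
over as a plain vector of cell values, read row by row.

```r
# Hypothetical helper a menu command could call: `cells` stands for the
# values of a selected 2x2 Calc range, passed row by row.
fisher_from_range <- function(cells) {
  if (length(cells) != 4) {
    stop("Please select a 2x2 range of counts.")
  }
  counts <- matrix(as.numeric(cells), nrow = 2, byrow = TRUE)
  if (anyNA(counts) || any(counts < 0) || any(counts != round(counts))) {
    stop("The selected range must contain only non-negative whole counts.")
  }
  # The user never sees the matrix; only the test result comes back.
  fisher.test(counts)
}

result <- fisher_from_range(c(3, 1, 1, 3))
result$p.value
```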

Sincerely,

Leonard


Erich Neuwirth wrote:
> Sorry for not answering earlier.
> You mention some open issues.
> Let me ask some more questions to clarify these.
>
>
>
> Leonard Mada wrote:
>
>> Dear Prof. Neuwirth,
>> depth). Mainly three issues remained uncovered: a.) converting input
>> data/vectors to factors (is useful sometimes),
>>
>
> If you transfer a variable to R and then just apply the function factor
> in R, it is converted to a factor.
> Do you need more functionality?
>
>
>> b.) importing data as
>> matrices (for contingency tables, e.g. for a Fisher exact test) and
>>
>
> I assume by importing you mean import into the spreadsheet.
> The basic unit for data transfer in our framework is an array
> of a single underlying scalar type. (character, (real) number,
> complex number, time&date). So transferring a matrix in either direction
> is there. There is, however, one major concern.
> Spreadsheet programs don't care too much about missing values.
> I have not looked into the details of this in Calc, but I think
> that it is important to be careful about this problem in the interface.
>
>> c.) converting a data range to multiple vectors vs independently
>> selecting the data ranges for the multiple vectors.
>>
>
> When a range is transferred as a dataframe, in R you can immediately
> access the columns as vectors. What kind of additional functionality
> do you think is needed?
>
>
>





--

Five Minutes to Midnight:
Youth on human rights and current affairs
http://www.fiveminutestomidnight.org/
