Wojciech Gryc wrote:
Hi Leonard,

Thanks very much for your thoughts. First, if you want to try the GUI
package it's all up now on the wiki
(http://wiki.services.openoffice.org/wiki/R_and_Calc).

You raise a lot of good points for missing values. Right now, I only deal
with empty cells -- before a value is retrieved from a Calc cell, the
software checks to see if there is anything written in the cell. If not,
"NA" is inserted, so we avoid the problem of zeros.

That being said, I didn't account for other symbols. I'm trying to see if it
would be possible to create an external file where some of these settings
are stored, so a user can select which values are represented as missing
values... Although would it be safe to say that if a cell has a non-numeric
value, that we're dealing with a missing value?

That is NOT really safe. A *factor* may consist of letters.

Let's say we have some patient data and decide to store the gender as 'm' for male and 'f' for female (because that causes less confusion than assigning arbitrary numeric values). We would want to import this data as a factor, so the letters are perfectly valid: gender <- as.factor(our.gender.data), where our.gender.data is stored in the spreadsheet as 'm' and 'f' strings.
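
As a small sketch of that import step (the vector below is made-up illustration data standing in for the spreadsheet column, not real patient records):

```r
# Hypothetical gender column as it might arrive from the spreadsheet
our.gender.data <- c("m", "f", "f", "m", "f")

# Convert the strings to a factor; the letters become the category labels
gender <- as.factor(our.gender.data)

levels(gender)   # the distinct categories found in the data
table(gender)    # number of entries in each category
```

Nothing here is missing: every cell holds a valid, non-numeric value.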

A different example:
> y <- c("a","b","c","d","a","d","c","c","a","c","d","b")
> yf <- as.factor(y)
> yf
[1] a b c d a d c c a c d b
Levels: a b c d

We may have 4 categories for something and choose to represent them with meaningful letters rather than arbitrary numbers. This also keeps us from accidentally computing statistics (a mean, for example) on numeric codes that have no numeric meaning.
> is.numeric(yf)
[1] FALSE

So, it is NOT safe to assume that non-numeric entries are 'missing values'.
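
One way the import could handle this, as a rough sketch only: treat empty cells and an explicit, user-chosen list of markers as missing, and leave every other string alone. The function name cells.to.column and the default marker list are invented for illustration; they are not part of the R and Calc package.

```r
# Hypothetical helper: turn raw cell strings into an R vector, treating
# only empty cells and the listed markers as missing values.
cells.to.column <- function(cells, na.markers = c("", "NA", "-", "?")) {
  cells <- trimws(cells)
  cells[cells %in% na.markers] <- NA

  # Attempt a numeric conversion; accept it only if it introduces no
  # new NAs, i.e. every non-missing cell really was a number.
  nums <- suppressWarnings(as.numeric(cells))
  if (all(is.na(nums) == is.na(cells))) {
    return(nums)
  }

  # Otherwise the column holds text, so keep it as categories.
  as.factor(cells)
}

cells.to.column(c("1.5", "", "3", "?"))   # numeric column, two NAs
cells.to.column(c("m", "f", "", "m"))     # factor column, one NA
```

The marker list is the part that could live in the external settings file, so that each user decides which symbols count as missing.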

Sincerely,

Leonard
