Sebastien Bihorel wrote:
OK, just so I get that straight: is the 'labelled' class something that you created in your package, or a readily available class in base R?

It's something we added for the Hmisc package.
Signing off,
Frank


*Sebastien Bihorel, PharmD, PhD*
PKPD Scientist
Cognigen Corp
Email: sebastien.biho...@cognigencorp.com
Phone: (716) 633-3463 ext. 323


Frank E Harrell Jr wrote:
Sebastien Bihorel wrote:
I also realized the flaw after testing the script on various datasets...

Following up on your last note:
1- Is that the reason why the class of integer and regular numeric variables is solely "labelled" following sasxport.get?

Yes. R gurus might correct me, but just creating a numeric vector doesn't create a 'hard' class, and adding your own class attribute equal to 'numeric' or 'integer' might cause a problem downstream.

2- Can class be 'soft' for other kinds of variables?

Not that I can recall.

3- Would you anticipate the following wrapper function to generate incompatibilities with other R functions?

I'm going to beg off on that. I'm not enough of an expert on the impact of adding such classes.

Frank



SASxpt.get <- function(file, force.single = TRUE,
                       method = c('read.xport', 'dataload', 'csv'),
                       formats = NULL, allow = NULL, out = NULL, keep = NULL,
                       drop = NULL, as.is = 0.5, FUN = NULL) {

  foo <- sasxport.get(file = file, force.single = force.single, method = method,
                      formats = formats, allow = allow, out = out, keep = keep,
                      drop = drop, as.is = as.is, FUN = FUN)

  # For each variable of class "labelled" (and only "labelled"),
  # add the native class as a second class element
  sglClassVarInd <- which(lengths(lapply(unclass(foo), class)) == 1)

  # Looping over the indices directly is safe when no column qualifies
  for (i in sglClassVarInd) {
    x <- foo[[i]]
    if (identical(class(x), "labelled"))
      class(foo[[i]]) <- c(class(x), class(unclass(x)))
  }
  return(foo)
}
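
A minimal sketch of how the class-adding step above behaves, runnable without Hmisc or a SAS transport file. The data frame `foo` is built by hand, and its "labelled" classes are set manually to imitate what this thread describes for sasxport.get output; that imitation is an assumption. `addNativeClass` is a hypothetical helper name and is not part of Hmisc.

```r
# Hand-built stand-in for sasxport.get() output (assumption: numeric
# columns come back with "labelled" as their only class).
foo <- data.frame(id = 1:3, dose = c(0.5, 1, 2.5))
for (nm in names(foo)) {
  attr(foo[[nm]], "label") <- paste("Label for", nm)
  class(foo[[nm]]) <- "labelled"
}

# Hypothetical helper: append the underlying class to every column
# whose only class is "labelled".
addNativeClass <- function(df) {
  for (i in seq_along(df)) {
    x <- df[[i]]
    if (identical(class(x), "labelled"))
      class(df[[i]]) <- c("labelled", class(unclass(x)))
  }
  df
}

bar <- addNativeClass(foo)
classes <- lapply(bar, class)
# classes$id   is c("labelled", "integer")
# classes$dose is c("labelled", "numeric")
```

Whether functions downstream cope with an explicit "integer" or "numeric" class attribute is the open question raised in the reply above; this sketch does not settle it.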




Frank E Harrell Jr wrote:
Sebastien Bihorel wrote:
Thanks a lot Frank,

One last question, though. I was tempted to remove all attributes of my variables after the sasxport.get call using
foo <- sasxport.get(...)
foo <- as.data.frame(lapply(unclass(foo),as.vector))
Since I have never worked with objects of class 'labelled', I was wondering what I would lose by removing this attribute.

Not a good idea, for many reasons including dates and other types.

And the labelled type is needed if you subset the data, in order to keep the labels.
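
The point about subsetting can be sketched in base R alone. Plain subsetting drops a "label" attribute, while a class with its own `[` method can put it back. The class name `mylabelled` and its method are hypothetical and written only for illustration; they are not Hmisc's actual implementation.

```r
x <- c(10, 20, 30)
attr(x, "label") <- "Dose (mg)"

# Plain subsetting keeps the values but drops the "label" attribute.
plain_label <- attr(x[1:2], "label")

# Hypothetical class with a "[" method that restores label and class.
"[.mylabelled" <- function(x, ...) {
  lab <- attr(x, "label")
  cls <- oldClass(x)
  out <- NextMethod("[")
  attr(out, "label") <- lab
  class(out) <- cls
  out
}

y <- structure(x, class = "mylabelled")
kept_label <- attr(y[1:2], "label")
# plain_label is NULL; kept_label is "Dose (mg)"
```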

Note that your original issue is related to "class" being "soft" for integers and regular numerics:

> x <- 1:3
> attributes(x)
NULL
> class(x)
[1] "integer"
> x <- runif(3)
> class(x)
[1] "numeric"
> attributes(x)
NULL

Frank




Frank E Harrell Jr wrote:
sebastien.biho...@cognigencorp.com wrote:
The problem is actually not related to a broken command but to an attempt at
operational qualification of R. A few years ago, my company developed a
set of scripts for the 'operational qualification' of Splus. We are
switching to R, so I am currently trying to port the scripts to R.
All Splus scripts imported SAS data using the importData function, which I
replaced with sasxport.get. One particular script returns the class of
each variable of the imported data frame; the output must match the
expected values: numeric, factor, integer, etc. The R 'translation' with
sasxport.get is thus problematic.
If there is no easy tweak of the function, we will probably have to remove
this script from our list of 'qualification' scripts.

Although it would be nice

Then my advice is to write your own wrapper function for sasxport.get that takes its output, looks for labelled variables, and adds a new class of your choosing depending on properties of the variable, making sure that you write methods needed for that class (if any). Then test your new function, not sasxport.get explicitly.

Frank
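
One possible variant of the wrapper idea, sketched under assumptions: rather than adding classes to the imported data, the qualification script could report a single type per variable while ignoring "labelled". `nativeClass` is a hypothetical helper name, and the data frame below is built by hand as a stand-in for imported data.

```r
# Hypothetical helper: report one class per variable, ignoring "labelled".
nativeClass <- function(x) {
  cls <- setdiff(class(x), "labelled")
  # Only "labelled" was present: fall back to the underlying class.
  if (length(cls) == 0) cls <- class(unclass(x))
  cls[1]
}

# Hand-built stand-in for an imported data frame.
df <- data.frame(n = 1:2, g = factor(c("a", "b")))
class(df$n) <- "labelled"
class(df$g) <- c("labelled", "factor")

report <- vapply(df, nativeClass, character(1))
# report is c(n = "integer", g = "factor")
```

This leaves the imported objects untouched, so it sidesteps the question of whether extra class attributes cause trouble downstream, at the cost of testing the helper rather than class() itself.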


Sebastien Bihorel wrote:
Frank,

It is a non-issue for me if the variables of class "labelled"
(and only "labelled") can only be numerical variables (integer or
numeric).

Sebastien
'labelled' can apply to any type of vector.  I'm not clear on the
problem this causes you. Please provide a command that is broken by
this behavior.

Frank

Frank E Harrell Jr wrote:
Sebastien Bihorel wrote:
Dear R-users,

The sasxport.get function (from the Hmisc package) automatically
defines the class of imported variables. I have noticed that the
class of theoretically numeric variables is simply "labelled",
although character variables might end up being defined as "labelled"
"Date" or "labelled" "factor".
Is there a way to tell sasxport.get to define numeric variables as
"labelled" "integer" or "labelled" "numeric"?
Sebastien,

If that would fix a problem you're having we could look into it.
Otherwise I'd tend to leave well enough alone.

Frank

Thank you

Sebastien

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



--
Frank E Harrell Jr   Professor and Chair           School of Medicine
                     Department of Biostatistics   Vanderbilt University