I also realized the flaw after testing the script on various datasets...

Following up on your last note:
1- Is that the reason why the class of integer and regular numeric variables is solely "labelled" after a call to sasxport.get?
2- Can class be 'soft' for other kinds of variables?
3- Would you anticipate the following wrapper function to generate incompatibilities with other R functions?


library(Hmisc)  # provides sasxport.get

SASxpt.get <- function(file, force.single = TRUE,
                       method = c('read.xport', 'dataload', 'csv'),
                       formats = NULL, allow = NULL, out = NULL,
                       keep = NULL, drop = NULL, as.is = 0.5, FUN = NULL) {

  foo <- sasxport.get(file = file, force.single = force.single, method = method,
                      formats = formats, allow = allow, out = out, keep = keep,
                      drop = drop, as.is = as.is, FUN = FUN)

  # For each variable whose only class is "labelled", append the native
  # (implicit) class as a second element of the class vector.
  sglClassVarInd <- which(sapply(unclass(foo), function(x) length(class(x))) == 1)

  # Looping over the indices directly is safe even when none are found
  # (1:length(...) would iterate over c(1, 0) in that case).
  for (i in sglClassVarInd) {
    x <- foo[[i]]
    if (identical(class(x), "labelled"))
      class(foo[[i]]) <- c(class(x), class(unclass(x)))
  }
  return(foo)
}
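
The class-appending step of the wrapper above can be tried without a SAS transport file. The following is only a sketch: the data frame is built by hand to imitate what sasxport.get returns (the column names and the helper name add_native_class are made up for illustration), and Hmisc is not loaded, so no "labelled" methods are involved.

```r
# Hand-built stand-in for a data frame returned by sasxport.get (assumption:
# numeric columns carry only the class "labelled", factors carry two classes)
foo <- data.frame(id  = 1:3,
                  wt  = c(70.2, 65.0, 80.1),
                  sex = factor(c("M", "F", "M")))
class(foo$id)  <- "labelled"
class(foo$wt)  <- "labelled"
class(foo$sex) <- c("labelled", "factor")

# Hypothetical helper holding only the class-appending logic of the wrapper
add_native_class <- function(df) {
  for (i in seq_along(df)) {
    x <- df[[i]]
    # Only touch variables whose sole class is "labelled"
    if (identical(class(x), "labelled"))
      class(df[[i]]) <- c("labelled", class(unclass(x)))
  }
  df
}

bar <- add_native_class(foo)
lapply(bar, class)
# id:  "labelled" "integer"
# wt:  "labelled" "numeric"
# sex: "labelled" "factor"   (unchanged)
```

Whether the extra class element interferes with other functions is not something this sketch can show; it only confirms that the class vectors come out as intended.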


*Sebastien Bihorel, PharmD, PhD*
PKPD Scientist
Cognigen Corp
Email: sebastien.biho...@cognigencorp.com
Phone: (716) 633-3463 ext. 323


Frank E Harrell Jr wrote:
Sebastien Bihorel wrote:
Thanks a lot Frank,

One last question, though. I was tempted to remove all attributes of my variables after the sasxport.get call using
foo <- sasxport.get(...)
foo <- as.data.frame(lapply(unclass(foo),as.vector))
Since I have never worked with objects of class 'labelled', I was wondering what I would lose by removing this attribute.

Not a good idea, for many reasons including dates and other types.
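
As a rough illustration of what the as.vector approach discards, here is a sketch in base R. It does not use Hmisc; the "labelled" vector is a hand-built stand-in, and the variable names are made up.

```r
# A Date is stored as days since 1970-01-01; as.vector() drops the class
d <- as.Date("1970-01-11")
d_plain <- as.vector(d)
class(d_plain)   # "numeric"
d_plain          # 10

# A factor comes back as a character vector; the factor structure is gone
f <- factor(c("low", "high", "low"), levels = c("low", "high"))
f_plain <- as.vector(f)
class(f_plain)   # "character"

# A hand-built stand-in for a labelled variable loses both class and label
v <- structure(c(1.5, 2.5), label = "Dose (mg)", class = "labelled")
v_plain <- as.vector(v)
attributes(v_plain)   # NULL
```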

And the labelled type is needed if you subset the data, in order to keep the labels.

Note that your original issue is related to "class" being "soft" for integers and regular numerics:

> x <- 1:3
> attributes(x)
NULL
> class(x)
[1] "integer"
> x <- runif(3)
> class(x)
[1] "numeric"
> attributes(x)
NULL
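
The same point can be sketched from the other direction: once an explicit class attribute is set, class() reports only that attribute and the implicit class is hidden until the object is unclassed. This is a base-R sketch; the "labelled" object is built by hand rather than by Hmisc, and the label text is invented.

```r
# Implicit ("soft") class: nothing is stored on the object itself
x <- 1:3
is.null(attr(x, "class"))   # TRUE
class(x)                    # "integer"

# An explicit class attribute replaces what class() reports
y <- structure(1:3, label = "Subject ID", class = "labelled")
class(y)                    # "labelled"
typeof(y)                   # "integer" (the storage type is unchanged)
inherits(y, "integer")      # FALSE

# Removing the class attribute brings the implicit class back
class(unclass(y))           # "integer"
```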

Frank



Frank E Harrell Jr wrote:
sebastien.biho...@cognigencorp.com wrote:
The problem is actually not related to a broken command but to an attempt
at operational qualification of R. A few years ago, my company developed a
set of scripts for the 'operational qualification' of Splus. We are
switching to R, so I am currently trying to port the scripts to R.
All Splus scripts imported SAS data using the importData function, which I
replaced with sasxport.get. One particular script returns the class of
each variable of the imported data frame; the output must match the
expected values: numeric, factor, integer, etc. The R 'translation' with
sasxport.get is thus problematic.
If there is no easy tweak of the function, we will probably have to remove
this script from our list of 'qualification' scripts.
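
The kind of check described here might look like the sketch below. It is not the actual qualification script: the helper name check_classes, the column names, and the expected values are all assumptions made for illustration, and a plain data frame stands in for the imported one.

```r
# Hypothetical helper: collapse each column's class vector into one string
# and compare it with a named vector of expected values
check_classes <- function(df, expected) {
  observed <- vapply(df, function(x) paste(class(x), collapse = "/"),
                     character(1))
  exp <- expected[names(observed)]
  # A column with no expected value counts as a mismatch
  bad <- is.na(exp) | observed != exp
  list(ok = !any(bad), observed = observed,
       mismatched = names(observed)[bad])
}

dat <- data.frame(id  = 1:3,
                  wt  = c(70.2, 65.0, 80.1),
                  sex = factor(c("M", "F", "M")))
expected <- c(id = "integer", wt = "numeric", sex = "factor")

res <- check_classes(dat, expected)
res$ok           # TRUE
res$mismatched   # character(0)
```

With data imported by sasxport.get, the observed value for a numeric column would be "labelled" alone, which is the mismatch reported in this thread.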

Although it would be nice

Then my advice is to write your own wrapper function for sasxport.get that takes its output, looks for labelled variables, and adds a new class of your choosing depending on properties of the variable, making sure that you write methods needed for that class (if any). Then test your new function, not sasxport.get explicitly.

Frank


Sebastien Bihorel wrote:
Frank,

It is a non-issue for me if the variables of class "labelled"
(and only "labelled") can only be numerical variables (integer or
numeric).

Sebastien
'labelled' can apply to any type of vector.  I'm not clear on the
problem this causes you.  Please provide a command that is broken by
this behavior.

Frank

Frank E Harrell Jr wrote:
Sebastien Bihorel wrote:
Dear R-users,

The sasxport.get function (from the Hmisc package) automatically
defines the class of imported variables. I have noticed that the
class of theoretically numeric variables is simply "labelled",
although character variables might end up being defined as "labelled"
"Date" or "labelled" "factor".
Is there a way to tell sasxport.get to define numeric variables as
"labelled" "integer" or "labelled" "numeric"?
Sebastien,

If that would fix a problem you're having we could look into it.
Otherwise I'd tend to leave well enough alone.

Frank

Thank you

Sebastien

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



--
Frank E Harrell Jr   Professor and Chair           School of Medicine
Department of Biostatistics Vanderbilt University
