Hi Mark,

I frequently need to do that when importing data. This one-liner works:

> data.frame(mapply(as, x, c("integer", "character", "factor"), 
> SIMPLIFY=FALSE), stringsAsFactors=FALSE);

but it has two problems:

1) as() is an S4 function that does not work for every target class
2) writing the vector of classes for 60 variables is rather tedious.
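
A small illustration of the first problem, as a sketch only: as far as I know, the methods package defines no coercion from character to "factor", so as() signals an error there while as.factor() works. The try() wrapper keeps the snippet running either way; the vector v is made up for illustration.

```r
library(methods)  # provides as(); attached by default in most sessions

v <- c("a", "b", "a")  # made-up example vector

# as() may have no coercion method for this target class
res <- try(as(v, "factor"), silent = TRUE)
as_failed <- inherits(res, "try-error")

# the ordinary as.<class>() converter handles it
f <- as.factor(v)
levels(f)
```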

Both issues can be solved with the following two helper functions. The first 
function tries as(x, class); if that doesn't work, it tries as.<class>(x); if 
that still doesn't work, it tries <class>(x). The second function transforms a 
single string into a character vector of classes by mapping each letter in the 
string to a class name (e.g. "D" becomes "Date", "i" becomes "integer", etc.), 
so that writing 60 classes is fast.

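doCoerce <- function(x, class) {
    # 1) Use the S4 coercion as(x, class) when one is available.
    if (canCoerce(x, class)) {
        as(x, class)
    } else {
        # 2) Otherwise try as.<class>(x), e.g. as.factor(x).
        result <- try(match.fun(paste("as", class, sep = "."))(x),
                      silent = TRUE)
        # 3) As a last resort, call <class>(x), e.g. factor(x).
        if (inherits(result, "try-error"))
            result <- match.fun(class)(x)
        result
    }
}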

expandClasses <- function (x) {
    unknowns <- character(0)
    result <- lapply(strsplit(as.character(x), NULL, fixed = TRUE), 
        function(y) {
            sapply(y, function(z) switch(z, 
                i = "integer", n = "numeric", 
                l = "logical", c = "character", x = "complex", 
                r = "raw", f = "factor", D = "Date", P = "POSIXct", 
                t = "POSIXlt", N = NA_character_, {
                  unknowns <<- c(unknowns, z)
                  NA_character_
                }), USE.NAMES = FALSE)
        })
    if (length(unknowns)) {
        unknowns <- unique(unknowns)
        warning(sprintf(ngettext(length(unknowns), "code %s not recognized", 
            "codes %s not recognized"), paste(dQuote(unknowns), collapse = ", ")))
    }
    result
}

An example:

> x <- data.frame(X="2008-01-01", Y=1.1:3.1, Z=letters[1:3])
> data.frame(mapply(doCoerce, x, expandClasses("Dif")[[1L]], SIMPLIFY=FALSE), 
> stringsAsFactors=FALSE);
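
If only three or four target types are involved, as in the original question, a plainer alternative might be one assignment per type, using lapply() over a subset of columns. This is just a sketch under assumptions: the data frame dat, its column names, and the *_cols vectors are all made up for illustration, and the columns in each group do not need to be contiguous.

```r
# Hypothetical data: every column arrives as character, as after an import
dat <- data.frame(
    id    = c("1", "2", "3"),
    grp   = c("a", "b", "a"),
    day   = c("2008-01-01", "2008-01-02", "2008-01-03"),
    score = c("1.5", "2.5", "3.5"),
    site  = c("x", "y", "y"),
    stringsAsFactors = FALSE
)

# Hypothetical column groups, one per target type (non-contiguous is fine)
fac_cols  <- c("grp", "site")
num_cols  <- c("id", "score")
date_cols <- "day"

# One line per data type: convert each group of columns in place
dat[fac_cols]  <- lapply(dat[fac_cols], as.factor)
dat[num_cols]  <- lapply(dat[num_cols], as.numeric)
dat[date_cols] <- lapply(dat[date_cols], as.Date)

# Inspect the resulting column classes
sapply(dat, class)
```

Assigning into dat[cols] rather than calling apply() keeps the object a data frame and leaves the untouched columns as they were.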

Regards,

Enrique


------------------------------

Message: 99
Date: Tue, 23 Jun 2009 15:23:54 -0600
From: Mark Na <mtb...@gmail.com>
Subject: [R] Apply as.factor (or as.numeric etc) to multiple columns
To: r-help@r-project.org
Message-ID:
        <e40d78ce0906231423m4c3da14i2f6270f92463c...@mail.gmail.com>
Content-Type: text/plain; charset=ISO-8859-1

Hi R-helpers,

I have a dataframe with 60 columns and I would like to convert several
columns to factor, others to numeric, and yet others to dates. Rather
than having 60 lines like this:

data$Var1<-as.factor(data$Var1)

I wonder if it's possible to write one line of code (per data type,
e.g. factor) that would apply a function (e.g., as.factor) to several
(non-contiguous) columns. So, I could then use 3 or 4 lines of code
(for 3 or 4 data types) instead of 60.

I have tried writing an apply function, but it failed.

Thanks for any help you might be able to provide.

Mark Na

