I thought the default was chosen for performance reasons. For large data.frames or repeated operations, using factors should be faster than using character columns when the strings are non-trivial.

> fs <- c('apple','peach','watermelon','spinach','persimmon','potato','kale')
> n <- 1000000
>
> a1 <- data.frame(f=sample(fs,n,replace=TRUE), x1=rnorm(n), x2=rnorm(n),
>                  stringsAsFactors=TRUE)
> a2 <- data.frame(f=sample(fs,n,replace=TRUE), x1=rnorm(n), x2=rnorm(n),
>                  stringsAsFactors=FALSE)
>
> fn <- function(i,x) x[x$f %in% c('kale','spinach'),]
> system.time(z <- sapply(1:100, fn, a1))
   user  system elapsed
 19.614   4.037  24.649
> system.time(z <- sapply(1:100, fn, a2))
   user  system elapsed
 19.726   7.715  36.761

On Feb 12, 2013, at 10:40 AM, Ben Bolker <bbol...@gmail.com> wrote:

>
> Thanks, Uwe.
> Now let me go one step farther.
>
> Can you (or anyone) give a good argument **other than backward
> compatibility** for keeping the stringsAsFactors=TRUE argument on
> data.frame()?
>
> I appreciate your distinction between data.frame() and read.table()'s
> use of stringsAsFactors, and I can see that there is some point for
> quick-and-dirty interactive use in setting all non-numeric variables to
> factors (arguing that wanting non-numerics as factors is somewhat more
> common than wanting them as strings).
>
> It might be nice to add an optional stringsAsFactors (and check.names)
> argument to transform(): I've had to write my own Transform() function
> to allow the defaults to be overridden, since transform() calls
> data.frame() with the defaults. (Setting the stringsAsFactors option
> globally would work, although not for check.names.)
>
> Ben Bolker
>
>>
>>>
>>>> What I will likely do is
>>>> make a few changes so that character vectors are automatically changed
>>>> to factors in modelling functions, so that operating with
>>>> stringsAsFactors=FALSE doesn't trigger silly warnings.
>>>>
>>>> Duncan Murdoch
>>>>
>>>
>>> [apologies for snipping context: "gmane made me do it"]
>>>
>>> ______________________________________________
>>> R-devel@r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-devel