Hi Tal,

I always think of factors as a way of imposing (however arbitrarily) order on some variable. To that extent, the key aspect is first, second, third, etc., represented numerically in factors as 1, 2, 3, etc. The labels are for convenience and interpretation. Consider:

x <- factor(c(5, 4, 6))
y <- factor(c(6, 5, 7))
as.numeric(x)  # 2 1 3
as.numeric(y)  # 2 1 3

Is the numeric or character value of 5 more important? Or is its relative position?

If you have character data that you might want to split and manipulate, store it as a string variable (you can set an option so that stringsAsFactors = FALSE is the default in read.table()). If your factor labels are numeric, that suggests the data might have been better stored as numeric in the first place.

Generally, when I find myself converting factors to numeric or character class data, it means I've been using factor() to recode data (which is not its intended purpose).

My 2 cents.

Cheers,

Josh

On Sat, Dec 11, 2010 at 2:48 PM, Tal Galili <tal.gal...@gmail.com> wrote:
> Hello dear R-help mailing list,
>
> My question is *not* about how factors are implemented in R (which is, if I
> understand correctly, that factors keep numbers and assign levels to them).
> My question *is* about why so many functions that work on factors don't
> treat them as characters by default.
>
> Here are two simple examples:
>
> Example one, turning the characters inside a factor into numeric:
>
> x <- factor(4:6)
> as.numeric(x) # output: 1 2 3
> as.numeric(as.character(x)) # output: 4 5 6 # isn't this what we wanted?
>
> Example two, using strsplit on a factor:
>
> x <- factor(paste(letters[4:6], 4:6, sep="A"))
> strsplit(x, "A") # will result in an error:
> # Error in strsplit(x, "A") : non-character argument
> strsplit(as.character(x), "A") # will work and split
>
> So what is the reason this is the case?
> Is it that implementing a switch of factors to characters as the default in
> some of the basic functions will cause old code to break?
> Is it a better design in some other way?
>
> I am curious to know the reason for this.
>
> Thank you for your reading,
> Tal
>
> ----------------Contact Details:----------------
> Contact me: tal.gal...@gmail.com | 972-52-7275845
> Read me: www.talgalili.com (Hebrew) | www.biostatistics.co.il (Hebrew) |
> www.r-statistics.com (English)
> ------------------------------------------------
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.

--
Joshua Wiley
Ph.D. Student, Health Psychology
University of California, Los Angeles
http://www.joshuawiley.com/
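
P.S. For anyone reading this in the archive, here is a minimal sketch of the distinction discussed above between a factor's integer codes and its labels. The variable names (codes_x, values_x, f, parts) are only illustrative and do not appear in the thread. The as.numeric(levels(x))[x] form is the idiom suggested on the ?factor help page for recovering numeric labels, and it should give the same result as as.numeric(as.character(x)).

```r
# Two factors with different labels but the same relative ordering.
x <- factor(c(5, 4, 6))
y <- factor(c(6, 5, 7))

# The integer codes record position among the sorted levels, not the labels.
codes_x <- as.integer(x)
codes_y <- as.integer(y)

# Recover the numeric labels by indexing the levels with the factor itself.
values_x <- as.numeric(levels(x))[x]

# String functions need an explicit conversion to character first.
f <- factor(paste(letters[4:6], 4:6, sep = "A"))
parts <- strsplit(as.character(f), "A")

print(codes_x)
print(codes_y)
print(values_x)
print(parts[[1]])
```

Both factors end up with identical codes even though their labels differ, which is the sense in which the relative position is what the factor actually stores.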