Dear list,

Subsetting vectors/arrays using factors can be seen as misleading, and I was thinking that it could be discouraged (at least by issuing a warning). I could not find whether this has been discussed before; please point me to a reference if I missed one.
The "extract" operator "[" can take as arguments either vectors of integers or vectors of characters in order to subset a data structure. For example:

> x <- seq(1, 5)
> names(x) <- letters[1:5]
>
> x[1]
a
1
> x["a"]
a
1

Using a factor caused some confusion to someone here, and I have to admit that it can indeed appear misleading:

> f <- factor("a", levels=c("b", "a", "c"))
> f
[1] a
Levels: b a c
> x[f] # here the integer is used, rather than the level
b
2

The dual nature of a factor (a vector of integers with an attached vector of levels) is not clear to many users, especially since factors are treated differently in other situations. Example:

> f == 1
[1] FALSE
> f == "a" # here the level is used, not the integer
[1] TRUE

This leads me to suggest that indexing using a factor could issue a warning, and that the user should explicitly wrap the vector with either "as.integer" or "as.character".

L.

PS: All examples above were run with

platform       x86_64-unknown-linux-gnu
arch           x86_64
os             linux-gnu
system         x86_64, linux-gnu
status         Under development (unstable)
major          2
minor          7.0
year           2008
month          03
day            12
svn rev        44742
language       R
version.string R version 2.7.0 Under development (unstable) (2008-03-12 r44742)

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
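For illustration, the explicit wrapping suggested above can be sketched as follows. The first part uses only base R ("as.character", "as.integer"). The function "extract_checked" is a hypothetical helper written for this example, not part of R, and only approximates what a warning inside "[" might look like.

```r
# Named vector and factor from the examples above.
x <- seq(1, 5)
names(x) <- letters[1:5]
f <- factor("a", levels = c("b", "a", "c"))

# Explicit conversions make the intent unambiguous.
by_level <- x[as.character(f)]  # looks up the name "a": value 1
by_code  <- x[as.integer(f)]    # uses the internal code 2: value 2, named "b"

# Hypothetical helper (an assumption, not an R function): behaves like
# x[i], but warns when i is a factor before falling back to its codes.
extract_checked <- function(x, i) {
  if (is.factor(i)) {
    warning("indexing with a factor uses its integer codes; ",
            "wrap it in as.integer() or as.character()")
    i <- as.integer(i)  # preserve the current behaviour of "["
  }
  x[i]
}
```

With these definitions, "extract_checked(x, f)" returns the same element as "x[f]" does today but emits a warning, while "x[as.character(f)]" and "x[as.integer(f)]" stay silent because the user has stated which of the two meanings is wanted.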