On Mon, 5 Jun 2006, Marc Schwartz (via MN) wrote:

> Hi all,
>
> Based upon an offlist communication this morning, I am somewhat confused
> (more than I usually am on most Monday mornings...) about the use of
> grep() with factors as the 'x' argument.
>
> The argument guidance in ?grep indicates:
>
>   x, text  a character vector where matches are sought. Coerced to
>            character if possible.
>
> and in the Details section:
>
>   Arguments which should be character strings or character vectors are
>   coerced to character if possible.
>
>
> The wording of both would seem to reasonably lead to the conclusion that
> a factor could be coerced to a character vector by the use of
> as.character(FACTOR).

Well, that is not what is meant by the wording, nor what happens: there is
no method dispatch so the factor is coerced from an integer vector to a
character vector.  'coerced' usually means at low level: where
as.character() is involved we tend to say so.

As for the comments on what happens if value=TRUE: if the 'x' has been
coerced, I would expect the value to be based on the coerced value (and it
currently is).

> grep("1", factor(letters))
 [1]  1 10 11 12 13 14 15 16 17 18 19 21
> grep("1", factor(letters), value=TRUE)
 [1] "1"  "10" "11" "12" "13" "14" "15" "16" "17" "18" "19" "21"

So whereas I am quite happy to replace the low-level coercion by method
dispatch on as.character, I don't think this should be altered (and am
pretty sure there is code out there which expects a character vector
result).

> In tracing through the C code in character.c for do_grep(), which in
> turn calls coerceVector() in coerce.c, unless I am mis-reading the code
> (always possible), I don't see an indication that a factor would be
> coerced to a character vector.
>
> Since a factor -> character coercion would seem at face value, the most
> logical coercion to take place when using grep(), I am curious if I am
> missing something, or if perhaps ?grep needs to be more clear in the
> coercions that will or might take place. Perhaps even the consideration
> of an error message if a factor is passed as the 'x' argument, if indeed
> the coercion would not take place.
>
> Perhaps the easiest example here might be:
>
> # On R Version 2.3.1 (2006-06-01) on FC5
>
>> grep("[a-z]", letters)
>  [1]  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22
> [23] 23 24 25 26
>
>> grep("[a-z]", factor(letters))
> numeric(0)
>
>
> Thanks for any comments or any virtual rotten tomatoes coming my way at
> high speed.  :-)
>
> Marc Schwartz
>
> ______________________________________________
> R-devel@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-devel
>
>

-- 
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595
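
The two coercions contrasted in this message can be reproduced explicitly,
without relying on how any particular version of grep() treats a factor.
What follows is only a minimal sketch: the names f, codes and level_strings
are illustrative and do not come from the thread, and as.integer() is used
here merely to mimic the low-level integer-to-character coercion described
for R 2.3.1.

```r
# A factor stores integer codes plus a 'levels' attribute.
f <- factor(letters)

# Mimic the low-level coercion: the integer codes converted to strings.
codes <- as.character(as.integer(f))

# Method dispatch on as.character() gives the level labels instead.
level_strings <- as.character(f)

grep("1", codes)                  # positions whose code contains a "1"
grep("1", codes, value = TRUE)    # the matching code strings
grep("[a-z]", codes)              # no matches: the codes are all digits
grep("[a-z]", level_strings)      # every position matches
```

Calling as.character() on the factor before passing it to grep() makes the
intended coercion explicit, whatever grep() does internally.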