On Tue, 6 Jun 2006, Marc Schwartz (via MN) wrote:

> On Tue, 2006-06-06 at 11:12 +0100, Prof Brian Ripley wrote:
>> On Mon, 5 Jun 2006, Marc Schwartz (via MN) wrote:
>>
>>> Hi all,
>>>
>>> Based upon an offlist communication this morning, I am somewhat confused
>>> (more than I usually am on most Monday mornings...) about the use of
>>> grep() with factors as the 'x' argument.
>>>
>>> The argument guidance in ?grep indicates:
>>>
>>> x, text   a character vector where matches are sought. Coerced to
>>>           character if possible.
>>>
>>> and in the Details section:
>>>
>>> Arguments which should be character strings or character vectors are
>>> coerced to character if possible.
>>>
>>>
>>> The wording of both would seem to reasonably lead to the conclusion that
>>> a factor could be coerced to a character vector by the use of
>>> as.character(FACTOR).
>>
>> Well, that is not what is meant by the wording, nor what happens: there is
>> no method dispatch so the factor is coerced from an integer vector to a
>> character vector. 'coerced' usually means at low level: where
>> as.character() is involved we tend to say so.
>>
>> As for the comments on what happens if value=TRUE: if the 'x' has been
>> coerced, I would expect the value to be based on the coerced value (and it
>> currently is).
>>
>> > grep("1", factor(letters))
>> [1]  1 10 11 12 13 14 15 16 17 18 19 21
>> > grep("1", factor(letters), value=TRUE)
>> [1] "1"  "10" "11" "12" "13" "14" "15" "16" "17" "18" "19" "21"
>>
>> So whereas I am quite happy to replace the low-level coercion by method
>> dispatch on as.character, I don't think this should be altered (and am
>> pretty sure there is code out there which expects a character vector
>> result).
>
> Prof. Ripley,
>
> Thanks for your reply and clarification.
>
> I would acknowledge that the coercion of a factor to its numeric values
> would not be immediately intuitive to me (or others who have commented
> on this) within the context of grep(). However, in light of your
> comments and having reviewed the C code, it does make sense.
>
> Given this behavior, it would seem reasonable to provide a clarification
> in ?grep, perhaps as follows:
>
> Arguments
>
> x, text   a character vector where matches are sought. Coerced to
>           character if possible. See Details for factors.
>
>
> Details
>
> Arguments which should be character strings or character vectors are
> coerced to character if possible. In the case of factors, these are
> coerced using as.integer(x). You must explicitly coerce the factor using
> as.character(x) to use these functions on the character vector
> equivalent.

I do think we should `replace the low-level coercion by method dispatch on
as.character', and have done so in R-devel (but am still testing packages).
There have been quite a few instances of such low-level coercion (including
for dimnames), and I am currently looking through to see if there are any
others that either should be altered or the documentation clarified.

--
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
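
Illustrative note (not part of the original message): the distinction discussed in this thread, between a factor's integer codes and its character labels, can be sketched in a few lines of R. The example factor `f` and all variable names below are made up for illustration. Every call coerces explicitly, so the results should not depend on whether grep() itself uses low-level coercion or dispatches on as.character() in a given R version.

```r
# A factor stores integer codes plus a 'levels' attribute.
# 'f' is a hypothetical example, not taken from the thread.
f <- factor(c("apple", "banana", "cherry", "apple"))

# The integer codes underlying the factor (levels are sorted alphabetically).
codes <- as.integer(f)

# The labels, obtained through the as.character() method for factors.
labels <- as.character(f)

# Searching the labels: explicit coercion states the intent unambiguously.
hits_labels <- grep("an", as.character(f), value = TRUE)

# Searching the codes as strings, which is what the low-level coercion
# described in the thread amounted to.
hits_codes <- grep("1", as.character(as.integer(f)))

# The thread's factor(letters) example, with the code-based coercion
# spelled out explicitly.
thread_idx <- grep("1", as.character(as.integer(factor(letters))))
thread_val <- grep("1", as.character(as.integer(factor(letters))),
                   value = TRUE)

print(codes)
print(labels)
print(hits_labels)
print(hits_codes)
print(thread_idx)
```

Under these assumptions, `thread_idx` should match the indices quoted in the thread (1, 10 to 19, and 21), since the 26 levels of factor(letters) have codes 1 to 26. Code that needs the labels can call as.character() on the factor first and so avoid relying on how grep() coerces its argument.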