Thank you very much for your prompt reply and for adding the comments to the help pages for match and ==. I think the source of my confusion was that, from the current documentation (v 2.3.0), I did not realize that matching is different from equality testing. (Obviously it is different when using regular expressions, etc., but I thought that with plain "match" and %in%, matching would be determined by ==.)
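To make the distinction concrete, here is a minimal sketch in R contrasting equality testing with matching on the vectors from this thread. The helper in_by_equality is hypothetical (it is not part of R); it only illustrates what %in% would return if it were defined through ==.

```r
x <- c(1, 2, 3, NA)

# Equality testing: NA is not comparable, so the result is NA.
na_eq <- NA == NA                      # NA

# Matching: NA matches NA (and nothing else), so %in% never returns NA.
in_plain   <- x %in% c(2, 3)           # FALSE TRUE TRUE FALSE
in_with_na <- x %in% c(2, 3, NA)       # FALSE TRUE TRUE TRUE

# match() returns positions; NA in its result means "no match found".
pos <- match(x, c(2, 3, NA))           # NA 1 2 3

# NaN matches NaN but not NA, and vice versa.
nan_pos <- match(NaN, c(NA, NaN))      # 2
na_pos  <- match(NA,  c(NaN, NA))      # 2

# Hypothetical helper (an assumption for illustration, not an R function):
# membership defined via ==, which propagates NA.
in_by_equality <- function(x, table) {
  vapply(x, function(xi) any(xi == table), logical(1))
}

eq_plain   <- in_by_equality(x, c(2, 3))      # FALSE TRUE TRUE NA
eq_with_na <- in_by_equality(x, c(2, 3, NA))  # NA TRUE TRUE NA
```

Note that under the ==-based definition, an NA in the table also turns every non-matching element into NA, since the comparison with the missing entry could have been TRUE.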
Also, I did not mean for my first comment to sound like a criticism of R for treating NAs inconsistently. Nonetheless, I am still curious why the particular choice was made that "match" (and therefore %in%) acts differently from "==" with respect to NAs and NaNs (with the default, and only implemented, value of the "incomparables" parameter).

Thank you,
David

On May 28, 2006, at 1:10 AM, Prof Brian Ripley wrote:

> You start with very general comments, but only use one specific
> function, match (see ?"%in%", a help page entitled `value matching').
>
> Matching and equality are treated differently. By definition, NA
> matches NA and nothing else, and NaN matches NaN and nothing else.
> In comparisons, these values are not comparable.
>
> As you will have seen from the help page, match() has the expansion
> capacity for declaring values non-comparable. That has not been
> implemented for a decade and no one has supplied code to implement
> it, so it seems no one has much need of it.
>
> I have added notes to the help pages for match and == to say
> explicitly what matches and what is comparable. If the *Draft* R
> Language Definition were ever to be finished it would have such
> details: it already has a useful commentary.
>
> On Sat, 27 May 2006, David Soloveichik wrote:
>
>> I am wondering whether there is a well-accepted approach to handling
>> missing values (NA's) in a programming language such as R. For
>> example, most functions seem to propagate NA to the output when the
>> value of the missing entry could have mattered. In other words, most
>> functions are not willing to "take a stand" on what the missing value
>> was. However, some functions don't seem to do this. For example,
>>
>> > c(1,2,3,NA) %in% c(2,3)
>> [1] FALSE TRUE TRUE FALSE
>>
>> rather than: FALSE TRUE TRUE NA
>>
>>
>> Also, what is the logic of the following:
>> > c(1,2,3,NA) %in% c(2,3,NA)
>> [1] FALSE TRUE TRUE TRUE
>>
>> Why is the last output value TRUE? Why does R claim that the NA on
>> the left hand side of %in% is the same as the NA on the right hand
>> side of %in%?
>
> It does not: it reports that it *matches*. Please do read the help
> page before posting, as the posting guide asked you to.
>
>> PLEASE do read the posting guide!
>> http://www.R-project.org/posting-guide.html
>
> --
> Brian D. Ripley, [EMAIL PROTECTED]
> Professor of Applied Statistics, http://www.stats.ox.ac.uk/~ripley/
> University of Oxford, Tel: +44 1865 272861 (self)
> 1 South Parks Road, +44 1865 272866 (PA)
> Oxford OX1 3TG, UK Fax: +44 1865 272595

______________________________________________
[email protected] mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide! http://www.R-project.org/posting-guide.html
