It is history:

r16144 | ripley | 2001-09-28 19:40:28 +0100 (Fri, 28 Sep 2001) | 2 lines

add is.na<-, distinguish NA level and NA codes in factors

so predates having NA character strings distinct from "NA".

On Tue, 11 Jul 2006, Brahm, David wrote:

> I mentioned this in R-help on April 28:
> <https://stat.ethz.ch/pipermail/r-help/2006-April/104595.html>
>
> | as.character.factor contains this line (where cx=levels(x)[x]):
> |   if ("NA" %in% levels(x)) cx[is.na(x)] <- "<NA>"
> |
> | Is it possible that this is no longer the desired behavior?  These
> | two results don't seem very consistent:
> |
> | > as.character(as.factor(c("AB", "CD", NA)))
> | [1] "AB" "CD" NA
> | > is.na(.Last.value)[3]
> | [1] TRUE
> |
> | > as.character(as.factor(c("NA", "CD", NA)))
> | [1] "NA"   "CD"   "<NA>"
> | > is.na(.Last.value)[3]
> | [1] FALSE
> |
> | I'm using R-2.3.0 on Redhat Linux, but I don't think the behavior
> | is new (maybe since character NA's were introduced?).
> |
> | -- David Brahm ([EMAIL PROTECTED])
>
>
> -----Original Message-----
> From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED] On Behalf Of Peter Dalgaard
> Sent: Tuesday, July 11, 2006 5:59 PM
> To: J. Hosking
> Cc: r-devel@stat.math.ethz.ch
> Subject: Re: [Rd] Dropping unused levels of a factor that has "NA" as a level
>
> "J. Hosking" <[EMAIL PROTECTED]> writes:
>
> > Is this a bug?
> >
> > > f1 <- factor(c("a", NA), levels = c("a", "NA") )
> > > f2 <- f1[, drop = TRUE]
> > > f2
> > [1] a    <NA>
> > Levels: a <NA>
> >
> > I would have expected f2 to have only one level, "a".  It seems
> > to me that the code in [.factor does not follow the advice in
> > help("factor") on how to set factor codes to be missing when
> > "NA" is a level of the factor.
>
>
> Something odd is going on, that's for sure...
>
> The problem is also there with factor(f1). And the logic in
> as.character.factor seems to be at the root of it:
>
> > as.character.factor
> function (x, ...)
> {
>     cx <- levels(x)[x]
>     if ("NA" %in% levels(x))
>         cx[is.na(x)] <- "<NA>"
>     cx
> }
>
> This looks like something from before we had character NA values. I
> wonder if it is a mistake or there could actually be a reason to
> keep it.
>

-- 
Brian D. Ripley,                  [EMAIL PROTECTED]
Professor of Applied Statistics,  http://www.stats.ox.ac.uk/~ripley/
University of Oxford,             Tel:  +44 1865 272861 (self)
1 South Parks Road,                     +44 1865 272866 (PA)
Oxford OX1 3TG, UK                Fax:  +44 1865 272595

______________________________________________
R-devel@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-devel
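
For anyone who wants to reproduce the behaviour discussed in this thread on a current R, where as.character.factor may no longer contain the special case, the quoted logic can be copied into a standalone function. This is only a sketch: the name old_as_character_factor is made up for illustration, and as.integer(x) is written out explicitly where the quoted code indexes by the factor itself.

```r
# Hypothetical helper (not part of R): a copy of the as.character.factor
# logic quoted in the thread, so the behaviour can be reproduced anywhere.
old_as_character_factor <- function(x) {
  # Map integer codes to level labels; a missing code gives NA_character_.
  cx <- levels(x)[as.integer(x)]
  # The special case under discussion: if "NA" is one of the levels,
  # missing codes are relabelled with the string "<NA>".
  if ("NA" %in% levels(x)) cx[is.na(x)] <- "<NA>"
  cx
}

# No "NA" level: the missing value stays a real character NA.
r1 <- old_as_character_factor(factor(c("AB", "CD", NA)))
print(is.na(r1)[3])   # TRUE

# "NA" is a level: the missing value becomes the non-missing string "<NA>".
r2 <- old_as_character_factor(factor(c("NA", "CD", NA)))
print(r2)             # "NA" "CD" "<NA>"
print(is.na(r2)[3])   # FALSE

# J. Hosking's case: the missing code turns into an ordinary string, so
# re-factoring the result picks up "<NA>" as a level.
f1 <- factor(c("a", NA), levels = c("a", "NA"))
r3 <- old_as_character_factor(f1)
print("<NA>" %in% levels(factor(r3)))   # TRUE
```

Because "<NA>" is an ordinary string, anything that rebuilds the factor from this character vector treats it as a level, which is consistent with the f2 result reported above.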