You probably have to do it by hand.
I do not know the exact reason, but I can guess: once a factor is defined,
an empty level is still meaningful, so subsetting keeps it.
The simplest way is to call factor() again on the subset:
> f<-factor(1:3)
> f
[1] 1 2 3
Levels: 1 2 3
> factor(f[f!=2])
[1] 1 3
Levels: 1 3
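
The same idea applied to the data frame from the question quoted below might look like the following. This is only a sketch: the names refactored and dropped are mine, stringsAsFactors = TRUE is spelled out because newer R versions (4.0.0 and later) no longer convert strings to factors by default, and droplevels() assumes R 2.12.0 or later.

```r
# Rebuild the data frame from the question, forcing ID to be a factor.
sample <- data.frame(
  ID    = c("A", "A", "A", "A", "B", "B", "B", "C", "C", "C"),
  Value = c(5, 3, 5, 3, 6, 7, 8, 3, 2, 6),
  stringsAsFactors = TRUE
)

sample.filtered <- subset(sample, ID != "B")
levels(sample.filtered$ID)   # the unused level "B" is still listed

# Option 1: call factor() again on the column, as shown above.
refactored <- sample.filtered
refactored$ID <- factor(refactored$ID)

# Option 2: droplevels() drops unused levels from every factor column
# (assumption: R >= 2.12.0, where droplevels() is available).
dropped <- droplevels(sample.filtered)

levels(refactored$ID)
levels(dropped$ID)
```

Either result can then be passed to plot(Value ~ ID, data = ...) without the empty "B" group showing up.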
On 2/9/07, Roger Leigh <[EMAIL PROTECTED]> wrote:
> Hi folks,
>
> I am running into a problem when calling subset() on a large
> data.frame. One of the columns contains strings which are used as
> factors. R seems to automatically factor the column when the
> data.frame is constructed, and this appears to not get updated when I
> create a subset of the table.
>
> A minimal testcase to demonstrate the problem follows:
>
>
> sample <- data.frame(c("A", "A", "A", "A", "B", "B", "B", "C", "C", "C"),
> c(5,3,5,3,6,7,8,3,2,6))
> names(sample) <- c("ID", "Value")
>
> print(sample)
>
> sample.filtered <- subset(sample, ID != "B", select=c(ID, Value))
> # Or sample.filtered <- subset(sample, ID != "B", select=c(ID, Value), drop=T)
>
> print(sample.filtered)
>
> plot(sample.filtered)
> plot(sample.filtered$Value ~ sample.filtered$ID)
>
> print(levels(sample.filtered$ID))
> print(levels(factor(sample.filtered$ID)))
>
> plot(sample.filtered$Value ~ factor(sample.filtered$ID))
>
>
> Am I doing something wrong here, or is this an R bug? How can I get
> the new data.frame to update the factors, so I don't get redundant
> "empty" factors on the plot by eliminating the "phantom" factors? (I
> also need to remove the unused factors for other analyses, and
> factoring them "by hand" seems a little redundant.)
>
>
> Kind regards,
> Roger
>
> --
> .''`. Roger Leigh
> : :' : Debian GNU/Linux http://people.debian.org/~rleigh/
> `. `' Printing on GNU/Linux? http://gutenprint.sourceforge.net/
> `- GPG Public Key: 0x25BFB848 Please GPG sign your mail.
>