> From: F Z
>
> Hi there
>
> I have a data frame with about 65,000 rows and 8 variables. I am trying to
> get rid of the double entries of a factor variable "ID" so I can get a
> unique observation for each ID.
>
> I tried:
>
> >dupl_unique.data.frame(data[ID,])  #I obtain a data frame with 21,547
> >observations..so far so good, but then when I check for duplicates
>
> >d_duplicated(dupl2$ID)
> >summary(as.factor(d))
> FALSE  TRUE
>  6836 14711
>
> Meaning that I am still getting 14,711 duplicates!
>
> I tried changing the ID type to integer and repeated the process but I got
> identical results....what am I missing?

1. Upgrade your version of R. (That will teach you about using `_' for
   assignment!)

2. Call generics, not the methods; i.e., unique() instead of
   unique.data.frame().

3. You want a data frame where the IDs are unique, not the combination of
   columns. Use:

     dupl <- data[!duplicated(data$ID), ]

   (Indexing rows by the factor itself, as in data[ID,], uses its integer
   codes as row numbers, which is probably why duplicates remain.)

BTW, where did `dupl2' come from?

Andy

> Thanks!
>
> ______________________________________________
> [EMAIL PROTECTED] mailing list
> https://www.stat.math.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide!
> http://www.R-project.org/posting-guide.html
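
To make point 3 concrete, here is a minimal sketch on a toy data frame. The data and the column name `value` are made up for illustration and are not from the original post; only `ID` mirrors the poster's variable. It keeps the first row seen for each ID and then checks that no ID is repeated.

```r
# Hypothetical toy data: 7 rows, 3 distinct IDs (illustrative values only).
data <- data.frame(
  ID    = factor(c("b", "a", "b", "c", "a", "c", "c")),
  value = c(10, 20, 30, 40, 50, 60, 70)
)

# duplicated() flags every repeat of an ID after its first appearance,
# so negating it keeps exactly one row (the first) per ID.
dupl <- data[!duplicated(data$ID), ]

# Sanity checks: one row per distinct ID, and no ID left duplicated.
stopifnot(nrow(dupl) == length(unique(data$ID)))
stopifnot(!any(duplicated(dupl$ID)))

# For contrast: a factor used as a row index is treated as its integer
# codes, so data[data$ID, ] picks rows by code, not one row per ID.
codes <- as.integer(data$ID)
```

Which row survives for each ID depends on row order; if a particular observation per ID is wanted (say, the most recent), sort the data frame accordingly before calling duplicated().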