On Tue, 2006-05-16 at 14:37 -0400, Guenther, Cameron wrote:
> Hello everyone,
>
> I have a large dataset (x) with some rows that have duplicate variables
> that I would like to remove. I find which rows are the duplicates with
> X1<-which(duplicated(x)). That gives me the rows with duplicated
> variables. Now, how can I remove just those rows from the original data
> frame. I think I can create a new data frame without the duplicates
> using subset. I have tried:
> Subset(x,!x1) and subset(x,!x[x1,])
> I can't seem to find the correct syntax. Any advice.
> Thanks in advance

Even easier would be to use unique():

  NewDF <- unique(x)

NewDF will contain the rows from 'x' with duplicates removed. See ?unique
for more information.

unique(), which has a data.frame method, is basically:

  x[!duplicated(x), , drop = FALSE]

where drop = FALSE ensures that the result remains a data frame even
when 'x' has only a single column.

Note that the above presumes that you want to test all columns in 'x'
for dups.

HTH,

Marc Schwartz
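
For reference, here is a minimal sketch of the approaches discussed in this thread, run on a small made-up data frame. The data and the names NewDF1 to NewDF5 are hypothetical and only for illustration; 'x' and 'x1' follow the original question. It also shows why subset(x, !x1) cannot work: subset() expects a logical condition, while which() returns row numbers.

```r
# Hypothetical toy data; 'x' stands in for the poster's data frame.
x <- data.frame(id  = c(1, 2, 2, 3, 3),
                grp = c("a", "b", "b", "c", "c"),
                stringsAsFactors = FALSE)

# Logical indexing: keep rows that do not repeat an earlier row.
NewDF1 <- x[!duplicated(x), , drop = FALSE]

# The data.frame method of unique() should give the same rows.
NewDF2 <- unique(x)

# subset() wants a logical vector, so pass !duplicated(x) directly.
NewDF3 <- subset(x, !duplicated(x))

# With row numbers from which(), use negative indexing. Guard the case
# of no duplicates: x[-integer(0), ] would select zero rows.
x1 <- which(duplicated(x))
NewDF4 <- if (length(x1) > 0) x[-x1, , drop = FALSE] else x

# To test only some columns for dups (here just 'id', as an example),
# call duplicated() on those columns but index the full data frame.
NewDF5 <- x[!duplicated(x$id), , drop = FALSE]
```

The logical form x[!duplicated(x), ] is usually the safer habit, since it needs no special handling when there are no duplicates at all.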
