[R] Removing rows in dataframe w/o duplicated values
Hi,

Is there an easy way to remove data frame rows without duplicated values of a specified column ('id')? e.g.,

    dat <- data.frame(id = c(1,1,1,2,3,3), value = c(5,6,7,4,5,4), value2 = c(1,4,3,3,4,3))
    dat
      id value value2
    1  1     5      1
    2  1     6      4
    3  1     7      3
    4  2     4      3
    5  3     5      4
    6  3     4      3

This is sample data and the real data has hundreds of rows. In this case, only row 4 does not have a duplicated id and I would like to remove it without using:

    dat$id[4] <- NULL

Any help is appreciated!

AC

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
Re: [R] Removing rows in dataframe w/o duplicated values
This is ugly, but it gets what you want.

    dat[which(dat[, 1] %in% unique(dat[duplicated(dat[, 1], fromLast = TRUE), 1])), ]

AC Del Re wrote:
> Is there an easy way to remove dataframe rows without duplicated values
> of a specified column ('id')?

--
View this message in context: http://r.789695.n4.nabble.com/Removing-rows-in-dataframe-w-o-duplicated-values-tp4096582p4096672.html
Sent from the R help mailing list archive at Nabble.com.
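A shorter variant of the same idea, offered as a sketch rather than a drop-in answer: duplicated() only flags the second and later occurrences of a value, so combining a forward pass with a fromLast = TRUE pass flags every row whose id occurs more than once. The names keep and res_dup are made up for illustration.

```r
# Sample data from the original question
dat <- data.frame(id = c(1, 1, 1, 2, 3, 3),
                  value = c(5, 6, 7, 4, 5, 4),
                  value2 = c(1, 4, 3, 3, 4, 3))

# TRUE for every row whose id appears at least twice:
# the forward pass misses the first occurrence, the backward pass misses the last
keep <- duplicated(dat$id) | duplicated(dat$id, fromLast = TRUE)

# Keep only rows with a repeated id (drops the row with id == 2)
res_dup <- dat[keep, ]
res_dup
```

Because this indexes the original data frame directly, the surviving rows keep their original row names.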
Re: [R] Removing rows in dataframe w/o duplicated values
Hi:

Here's one way:

    do.call(rbind, lapply(L, function(d) if(nrow(d) > 1) return(d)))
        id value value2
    1.1  1     5      1
    1.2  1     6      4
    1.3  1     7      3
    3.5  3     5      4
    3.6  3     4      3

HTH,
Dennis

On Tue, Nov 22, 2011 at 9:43 AM, AC Del Re de...@wisc.edu wrote:
> Is there an easy way to remove dataframe rows without duplicated values
> of a specified column ('id')?
Re: [R] Removing rows in dataframe w/o duplicated values
Sorry, you need this first:

    L <- split(dat, dat$id)
    do.call(rbind, lapply(L, function(d) if(nrow(d) > 1) return(d)))

D.

On Tue, Nov 22, 2011 at 10:38 AM, Dennis Murphy djmu...@gmail.com wrote:
> Here's one way:
>
> do.call(rbind, lapply(L, function(d) if(nrow(d) > 1) return(d)))
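For what it's worth, the same split-and-recombine idea can also be written with Filter(), which avoids relying on if() quietly returning NULL for the single-row groups. This is only a sketch: keep_groups and res_split are illustrative names, and resetting the row names at the end is optional (rbind() otherwise labels the rows 1.1, 1.2, ... from the list names).

```r
# Sample data from the original question
dat <- data.frame(id = c(1, 1, 1, 2, 3, 3),
                  value = c(5, 6, 7, 4, 5, 4),
                  value2 = c(1, 4, 3, 3, 4, 3))

# One data frame per distinct id
L <- split(dat, dat$id)

# Keep only the groups that have more than one row
keep_groups <- Filter(function(d) nrow(d) > 1, L)

# Stack the remaining groups back into a single data frame
res_split <- do.call(rbind, keep_groups)

# Optional: replace the "1.1", "3.5"-style row names with 1..n
rownames(res_split) <- NULL
res_split
```

Splitting copies the data once per group, so for large data an index-based approach may be lighter, but for a few hundred rows the difference should not matter.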
Re: [R] Removing rows in dataframe w/o duplicated values
One approach is the following:

    dat <- data.frame(id = c(1,1,1,2,3,3), value = c(5,6,7,4,5,4), value2 = c(1,4,3,3,4,3))
    ind <- ave(dat$id, dat$id, FUN = length) > 1
    dat[ind, ]

I hope it helps.

Best,
Dimitris

On 11/22/2011 6:43 PM, AC Del Re wrote:
> Is there an easy way to remove dataframe rows without duplicated values
> of a specified column ('id')?

--
Dimitris Rizopoulos
Assistant Professor
Department of Biostatistics
Erasmus University Medical Center

Address: PO Box 2040, 3000 CA Rotterdam, the Netherlands
Tel: +31/(0)10/7043478
Fax: +31/(0)10/7043014
Web: http://www.erasmusmc.nl/biostatistiek/
Re: [R] Removing rows in dataframe w/o duplicated values
On Nov 22, 2011, at 12:43 PM, AC Del Re wrote:
> Is there an easy way to remove dataframe rows without duplicated values
> of a specified column ('id')?

    dat[ave(dat$id, dat$id, FUN=length) > 1, ]
      id value value2
    1  1     5      1
    2  1     6      4
    3  1     7      3
    5  3     5      4
    6  3     4      3

David Winsemius, MD
West Hartford, CT
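One caveat with the ave() approach, stated as an assumption about data beyond the sample shown: ave() writes the result of FUN back into a copy of its first argument, so when id is a factor rather than a number the counts may not come back as usable numbers. A common workaround is to count over seq_along() instead. In this sketch the factor id column and the names n_per_id and res_ave are invented for illustration.

```r
# Hypothetical variant of the sample data where id is a factor
dat <- data.frame(id = factor(c("a", "a", "a", "b", "c", "c")),
                  value = c(5, 6, 7, 4, 5, 4),
                  value2 = c(1, 4, 3, 3, 4, 3))

# Group size for each row: ave() fills an integer vector here,
# so the type of the id column no longer matters
n_per_id <- ave(seq_along(dat$id), dat$id, FUN = length)

# Keep rows whose id occurs more than once
res_ave <- dat[n_per_id > 1, ]
res_ave
```

The same two lines work unchanged on the numeric id from the original post.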