Re: [R] Conditionally remove rows with logic
try this:

> input <- read.table(text = "ID TIME LABEL
+ 1 0 0
+ 1 3 0
+ 1 6 0
+ 1 9 0
+ 1 12 1
+ 1 15 0
+ 1 18 0
+ 2 0 0
+ 2 3 0
+ 2 6 1
+ 2 9 0
+ 2 12 0
+ 2 15 0
+ 2 18 0", header = TRUE)
>
> result <- do.call(rbind,
+     lapply(split(input, input$ID), function(.id){
+         indx <- which(.id$LABEL == 1)
+         if (length(indx) == 1) .id <- .id[1:indx, ]  # keep up to the '1'
+         .id
+     })
+ )
>
>
> result
     ID TIME LABEL
1.1   1    0     0
1.2   1    3     0
1.3   1    6     0
1.4   1    9     0
1.5   1   12     1
2.8   2    0     0
2.9   2    3     0
2.10  2    6     1
>

Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.

On Sun, Aug 7, 2016 at 6:21 PM, Jennifer Sheng wrote:

> Dear all,
>
> I need to remove any rows AFTER the label becomes 1.  For example, for ID
> 1, the two rows with TIME of 15 & 18 should be removed; for ID 2, any rows
> after time 6, i.e., rows of time 9-18, should be removed.  Any
> suggestions?  Thank you very much!

__
R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
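The split/lapply code above only trims a group when it contains exactly one 1; a group with more than one 1 is returned untouched. If that case can occur, a vectorised variant may be worth considering. What follows is only a sketch and not the method posted above. It rebuilds the example data by hand (an assumption based on the TIME values mentioned in the question), assumes the rows are already sorted by TIME within each ID and that LABEL holds only 0 or 1 with no NA, and uses ave() to count the 1s seen strictly before each row. The names df, ones_before and result are illustrative.

```r
# Example data rebuilt by hand from the question (assumed values).
df <- data.frame(
  ID    = rep(1:2, each = 7),
  TIME  = rep(seq(0, 18, by = 3), times = 2),
  LABEL = c(0, 0, 0, 0, 1, 0, 0,   # ID 1: the 1 is at TIME 12
            0, 0, 1, 0, 0, 0, 0)   # ID 2: the 1 is at TIME 6
)

# Within each ID, count the 1s that occur strictly before the current row.
# cumsum(x) counts 1s up to and including the row; subtracting x excludes it.
ones_before <- ave(df$LABEL, df$ID, FUN = function(x) cumsum(x) - x)

# Keep a row only if no 1 has occurred earlier in its group.
result <- df[ones_before == 0, ]
print(result)
```

Because the rows are selected with a logical index, result keeps the original row names of df rather than the compound names that rbind() builds from a split list. A group without any 1 is kept whole, and a group with several 1s is cut after the first one.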
Re: [R] Conditionally remove rows with logic
Assuming that within each ID the data is sorted by increasing TIME, and
that LABEL==1 occurs only once within each ID, I would try something
like this.

Suppose that your data is in a data frame named "df".

df.keep <- logical()
for (id in unique(df$ID)) {
  df.tmp <- subset(df, df$ID==id)
  tmp.keep <- rep(TRUE, nrow(df.tmp))
  tmp.keep[df.tmp$TIME > df.tmp$TIME[df.tmp$LABEL==1]] <- FALSE
  df.keep <- c(df.keep, tmp.keep)
}
newdf <- df[df.keep , ]

I have not tested this.

I'm sure it could be made more efficient, and probably with a bit of
cleverness one could avoid creating temporary subsets of the input. But I
tend to find such subsets handy for testing and debugging. Unless your
input data is huge, it should be fast enough that you won't notice the
inefficiencies.

-Don

--
Don MacQueen

Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062

On 8/7/16, 3:21 PM, "R-help on behalf of Jennifer Sheng" wrote:

>Dear all,
>
>I need to remove any rows AFTER the label becomes 1.  For example, for ID
>1, the two rows with TIME of 15 & 18 should be removed; for ID 2, any rows
>after time 6, i.e., rows of time 9-18, should be removed.  Any
>suggestions?  Thank you very much!
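On the remark about avoiding temporary subsets: one way this might be done is to look up a per-ID cutoff time and compare every row against it in one step. The sketch below is an assumption-laden illustration, not Don's code. The function name drop_after_first_one is hypothetical, the example data are rebuilt by hand from the question, and LABEL is assumed to contain no NA. Two differences from the loop are worth noting: the rows for an ID do not have to be contiguous or sorted, and an ID with several 1s is cut at the earliest flagged TIME.

```r
# Hypothetical helper: keep rows up to and including the earliest TIME
# at which LABEL == 1 within each ID. IDs without any 1 are kept whole.
drop_after_first_one <- function(df) {
  ones <- df[df$LABEL == 1, c("ID", "TIME")]
  # Earliest flagged TIME per ID: sort, then keep the first row of each ID.
  ones <- ones[order(ones$ID, ones$TIME), ]
  ones <- ones[!duplicated(ones$ID), ]
  # Cutoff for every row of df; NA where the ID never has LABEL == 1.
  cutoff <- ones$TIME[match(df$ID, ones$ID)]
  df[is.na(cutoff) | df$TIME <= cutoff, ]
}

# Example data rebuilt by hand from the question (assumed values).
df <- data.frame(
  ID    = rep(1:2, each = 7),
  TIME  = rep(seq(0, 18, by = 3), times = 2),
  LABEL = c(0, 0, 0, 0, 1, 0, 0,
            0, 0, 1, 0, 0, 0, 0)
)

newdf <- drop_after_first_one(df)
print(newdf)
```

The comparison is on TIME rather than on row position, so rows that share the cutoff TIME within an ID would all be kept.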
Re: [R] Conditionally remove rows with logic
Hi Jennifer,
A very pedestrian method, but I think it does what you want.

remove_rows_after_1<-function(x) {
 nrows<-dim(x)[1]
 # indices of rows to remove (stays NA if there are none)
 rtr<-NA
 rtrcount<-1
 # becomes TRUE once a 1 has been seen for the current ID
 got1<-FALSE
 thisID<-x$ID[1]
 for(i in 1:nrows) {
  # same ID and a 1 was already seen: mark this row for removal
  if(x$ID[i] == thisID && got1) {
   rtr[rtrcount]<-i
   rtrcount<-rtrcount+1
  }
  # new ID: reset the flag
  if(x$ID[i] != thisID) {
   thisID<-x$ID[i]
   got1<-FALSE
  }
  if(x$ID[i] == thisID && x$LABEL[i]) got1<-TRUE
 }
 return(rtr)
}

The function returns the indices of rows to be removed.

Jim

On Mon, Aug 8, 2016 at 8:21 AM, Jennifer Sheng wrote:
> Dear all,
>
> I need to remove any rows AFTER the label becomes 1.  For example, for ID
> 1, the two rows with TIME of 15 & 18 should be removed; for ID 2, any rows
> after time 6, i.e., rows of time 9-18, should be removed.  Any
> suggestions?  Thank you very much!
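Since the function returns row indices rather than the reduced data, one more step is needed to apply it. A sketch of that step follows, with the function repeated (only respaced) so the example runs on its own. The example data are rebuilt by hand from the question, which is an assumption, and the names df, rtr and result are illustrative. The guard is there because rtr stays NA when no rows qualify for removal, and negative indexing with NA would not return the data unchanged.

```r
# The function as posted in this message, respaced only.
remove_rows_after_1 <- function(x) {
  nrows <- dim(x)[1]
  rtr <- NA
  rtrcount <- 1
  got1 <- FALSE
  thisID <- x$ID[1]
  for (i in 1:nrows) {
    if (x$ID[i] == thisID && got1) {
      rtr[rtrcount] <- i
      rtrcount <- rtrcount + 1
    }
    if (x$ID[i] != thisID) {
      thisID <- x$ID[i]
      got1 <- FALSE
    }
    if (x$ID[i] == thisID && x$LABEL[i]) got1 <- TRUE
  }
  return(rtr)
}

# Example data rebuilt by hand from the question (assumed values).
df <- data.frame(
  ID    = rep(1:2, each = 7),
  TIME  = rep(seq(0, 18, by = 3), times = 2),
  LABEL = c(0, 0, 0, 0, 1, 0, 0,
            0, 0, 1, 0, 0, 0, 0)
)

rtr <- remove_rows_after_1(df)

# Drop the flagged rows; if nothing was flagged (rtr is NA), keep df as is.
result <- if (anyNA(rtr)) df else df[-rtr, ]
print(result)
```

The function walks the rows in order, so it expects the rows for each ID to be contiguous and sorted by TIME, as they are in the posted data.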
[R] Conditionally remove rows with logic
Dear all,

I need to remove any rows AFTER the label becomes 1.  For example, for ID
1, the two rows with TIME of 15 & 18 should be removed; for ID 2, any rows
after time 6, i.e., rows of time 9-18, should be removed.  Any
suggestions?  Thank you very much!

The current dataset looks like the following:

ID  TIME  LABEL
1   0     0
1   3     0
1   6     0
1   9     0
1   12    1
1   15    0
1   18    0
2   0     0
2   3     0
2   6     1
2   9     0
2   12    0
2   15    0
2   18    0

Thanks a lot!
Jennifer