This assumes that the data are sorted by customer and that, for each customer, the first value of Time_Diff is always missing and no other value is. If those assumptions hold, you can do something like:

A <- read.table(text = "customer Time_Diff flag_1
1 NA 1
1 10 2
1 8 3
1 15 1
1 9 2
1 10 3
2 NA 1
2 2 2
2 5 3", header = TRUE)

A$flag_1 <- NULL

library(data.table)
A <- as.data.table(A)

A[ , g15 := cumsum(c(0, ifelse(is.na(diff(Time_Diff > 12)), 0, diff(Time_Diff > 12) > 0)))]
## I'm not proud of the previous line, probably there is a cleaner way

A[ , flag_1 := 1:.N, by = c("customer", "g15")]
A[ , g15 := NULL]

Best,
Ista

On Sat, Sep 19, 2015 at 5:09 PM, Ravi Teja <raviteja2...@gmail.com> wrote:
> Hi,
>
> I am trying to apply the below logic to generate flag_1 column on a data
> set consisting of ~1.2 million records in R.
>
> Code :
>
> for(i in 1: nrows)
> {
>   if(A$customer[i]==A$customer[i+1])
>   {
>
>     if(is.na(A$Time_Diff[i]))
>       A$flag_1[i] <- 1
>     else if (A$Time_Diff[i] > 12)
>       A$flag_1[i] <- 1
>     else
>       A$flag_1[i] <- A$flag_1[i-1]+1
>
>   }
>
>   else
>   {
>
>     if(is.na(A$Time_Diff[i]))
>       A$flag_1[i] <- 1
>     else if (A$Time_Diff[i] > 12)
>       A$flag_1[i] <- 1
>     else
>       A$flag_1[i] <- A$flag_1[i-1]+1
>
>   }
> }
>
>
> Resultant dataset should look like
>
> Customer Time_diff flag_1
> 1 NA 1
> 1 10 2
> 1 8 3
> 1 15 1
> 1 9 2
> 1 10 3
> 2 NA 1
> 2 2 2
> 2 5 3
>
> The above logic will take approximately 60 hours to generate the flag_1
> column on a dataset consisting of ~1.2 million records. Is there any
> effective way in R to implement this logic in R ?
>
> Appreciate your help.
>
> Thanks,
> Ravi
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help@r-project.org mailing list -- To UNSUBSCRIBE and more, see
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
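
A possibly cleaner way to build the grouping, offered as a sketch rather than a tested replacement. The loop in the original question resets flag_1 to 1 whenever Time_Diff is NA or greater than 12, and otherwise adds 1 to the previous value. Under that reading, a running count of the reset rows gives a run id, and flag_1 is simply the position within each run. The names reset and run are made up for illustration, and base R is used so that nothing needs to be installed.

```r
# Example data from the original question (flag_1 omitted; it is rebuilt below).
A <- data.frame(
  customer  = c(1, 1, 1, 1, 1, 1, 2, 2, 2),
  Time_Diff = c(NA, 10, 8, 15, 9, 10, NA, 2, 5)
)

# TRUE wherever the counter should restart: a missing value or a gap above 12.
# Where Time_Diff is NA, is.na() is TRUE and `TRUE | NA` is TRUE, so no NA survives.
reset <- is.na(A$Time_Diff) | A$Time_Diff > 12

# Running count of restarts; every row of one run shares the same id.
run <- cumsum(reset)

# Position within each run (1, 2, 3, ...) is the flag.
A$flag_1 <- ave(seq_along(run), run, FUN = seq_along)

print(A)
```

Unlike the diff()-based line, this also restarts the count on each of two consecutive values above 12, which is what the loop does. With data.table, something like A[, flag_1 := seq_len(.N), by = .(run = cumsum(is.na(Time_Diff) | Time_Diff > 12))] should be equivalent, though neither version has been timed here on 1.2 million rows.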