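This should do much the same thing:

random.del <- function (x, n.keeprows, del.percent){
  # blank del.percent% of the entries of one column, chosen at random
  del <- function(col){
    col[sample.int(length(col), length(col)*del.percent/100)] <- NA
    col
  }
  # leave the first n.keeprows rows untouched
  change <- (n.keeprows+1):nrow(x)
  x[change,] <- lapply(x[change,], del)
  x
}
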
This is faster because it's vectorized: it assigns the NAs a whole column at a time instead of calling the data-frame assignment once per row.

[1] "Mine"
   user  system elapsed
  0.004   0.000   0.002
[1] "Yours"
   user  system elapsed
  1.172   0.020   1.193

Tom

On Sat, Apr 23, 2011 at 8:37 PM, sneaffer <sneaf...@mail.ru> wrote:
>
> Hello R-world,
> Please, help me to get round my little mess
> I have a data.frame in which I'd rather like some values to be NA for the
> future imputation process.
>
> I've come up with the following piece of code:
>
> random.del <- function (x, n.keeprows, del.percent){
>   n.items <- ncol(x)
>   k <- n.items*(del.percent/100)
>   x.del <- x
>   for (i in (n.keeprows+1):nrow(x)){
>     j <- sample(1:n.items, k)
>     x.del[i,j] <- NA
>   }
>   return (x.del)
> }
>
> The problem is that random.del turns out to be slow on huge samples.
> Is there any other more effective/charming way to do the same?
>
> Thanks,
> Sergey
>
> --
> View this message in context:
> http://r.789695.n4.nabble.com/How-to-erase-replace-certain-elements-in-the-data-frame-tp3470883p3470883.html
> Sent from the R help mailing list archive at Nabble.com.
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
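
P.S. One caveat: the function at the top of this reply blanks a fraction of each column, whereas the loop in the original post blanks a fixed number of columns in each row. If the per-row behaviour matters, something along these lines might keep it while still doing only one data-frame assignment. This is an untimed sketch, not the function behind the timings above. The name random.del.rowwise is made up for illustration, and rounding k down with floor() is an assumption about what is wanted when the percentage does not give a whole number of columns.

```r
# Hypothetical helper (not from the thread): blank k randomly chosen
# columns in every row after the first n.keeprows rows.
random.del.rowwise <- function(x, n.keeprows, del.percent) {
  n.items  <- ncol(x)
  k        <- floor(n.items * del.percent / 100)  # cells to blank per row (assumed rounding)
  n.change <- max(nrow(x) - n.keeprows, 0)        # number of rows that may be altered
  if (k == 0 || n.change == 0) return(x)          # nothing to do
  rows <- n.keeprows + seq_len(n.change)

  # One draw of k distinct column numbers per row, flattened row by row.
  cols <- unlist(lapply(rows, function(i) sample.int(n.items, k)))

  # Logical mask with the same shape as x; TRUE marks a cell to blank.
  mask <- matrix(FALSE, nrow(x), n.items)
  mask[cbind(rep(rows, each = k), cols)] <- TRUE

  x[mask] <- NA   # a single data-frame assignment instead of one per row
  x
}

set.seed(1)
d <- data.frame(a = 1:10, b = rnorm(10), c = letters[1:10], d = runif(10))
out <- random.del.rowwise(d, n.keeprows = 3, del.percent = 50)
rowSums(is.na(out))   # 0 for rows 1-3, 2 for every later row
```

The sampling still runs once per row, but on plain integer vectors; the data frame itself is only modified once, through the logical mask.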